Energy Based Logic Mining Analysis with Hopfield Neural Network for Recruitment Evaluation

An effective recruitment evaluation plays an important role in the success of companies, industries and institutions. To gain insight into the relationship between the factors contributing to systematic recruitment, an artificial neural network and a logic mining approach can be adopted as a data extraction model. In this work, an energy based k satisfiability reverse analysis incorporating a Hopfield neural network is proposed to extract the relationship between the factors in an electronic (E) recruitment data set. The attributes of the E recruitment data set are represented in the form of k satisfiability logical representation. We restrict the logical representation to 2-satisfiability and 3-satisfiability, which are regarded as systematic logical representations. The E recruitment data set is obtained from an insurance agency in Malaysia, with the aim of extracting the relationship of dominant attributes that contribute to positive recruitment among the potential candidates. Our approach is evaluated according to the correctness, robustness and accuracy of the induced logic obtained from the E recruitment data. Experimental simulations with different numbers of neurons indicate the effectiveness and robustness of energy based k satisfiability reverse analysis with a Hopfield neural network in extracting the dominant attributes toward positive recruitment in the insurance agency in Malaysia.


Introduction
Systematic recruitment evaluation requires an optimal decision support system to ensure high-quality service by the insurance agents, which in turn propels the success of insurance companies in Malaysia. In order to hire the right insurance agent, several independent bodies such as the Life Insurance Management Research Association (LIMRA) have introduced various tests to screen potential candidates [1]. Insurance companies and agencies are harnessing aggregated operational data to facilitate their recruitment activities based on predictions, as mentioned in [2,3]. Candidates are required to attend a pre-requisite seminar before they can advance to the next stage of the interview selection. Therefore, a comprehensive rule is needed to classify the attendance of the recruits. An insurance company needs to expand its recruitment market by attracting more candidates [4]. Consequently, higher attendance at this pre-requisite seminar will increase the chances of more agents being contracted. The main challenge faced by recruitment personnel is the low attendance rate at the pre-requisite seminar. In some cases, the candidates who do not commit to attending can make up 90% to 95% of all candidates. In other words, the company is losing resources by physically accommodating the wrong candidates. Therefore, selecting the right candidates based on their preliminary attributes will reduce the workload and increase the effectiveness of the recruitment team. Conventional recruitment systems are generally based on machine learning, regression analysis and decision tree discovery [5]. These methods perform well in classification with specific computations. However, an alternative approach is needed to extract the relationships between the factors contributing to positive or negative recruitment.
The term positive recruitment in this case refers to the ability of human resource (HR) personnel to choose the right candidates, namely those who will attend the pre-requisite seminar. Hence, more comprehensive "advice" is required to help HR personnel make an informed decision. This can be achieved by capitalizing on artificial neural network (ANN) and logic mining methods.
The use of data mining in recruitment has been growing with the development of Industrial Revolution 4.0. Data mining tasks are generally categorized as association, clustering, classification and prediction based on any given data set [6,7]. Ref. [4] proposed a data mining technique that integrates a decision tree for predicting job performance and personnel selection. The knowledge extraction approach via decision tree produced an acceptable degree of accuracy. However, it is difficult to analyze the behavior of the data from the attributes. In this context, the behavior of the data is defined as the pattern of the data that leads to the specific desired output. In addition, the application of machine learning approaches such as the support vector machine paradigm to predicting risk in HR has been discussed in [8]. The results were acceptable, but the method focuses on classification only. Thus, the real information from each of the attributes cannot be interpreted comprehensively. Ref. [9] extended the decision tree approaches over the k means clustering techniques in screening job applicants to the industry. In relation to this, Ref. [10] utilized an adaptive selection model approach in recruitment. Both methods provided good results in terms of error evaluations. However, the underlying behavior of the data set needs to be extracted separately. Next, [11] applied the back-propagation network to forecasting management associates' recruitment rates for different enterprises. In their work, the probability of the attributes is computed before being trained and tested by a back-propagation neural network to check the probability of a recruit staying in the firm. A recent work by [12] demonstrated the ability of the recurrent neural network (RNN) as the building block of the ability-aware person job fit neural network (APJFNN) model in training an industrial data set in China.
The proposed model recorded a better accuracy than state-of-the-art approaches such as decision tree, linear regression and gradient boosting decision tree. Since most of the aforementioned data mining techniques integrate statistical measures, an alternative method is appropriate for facilitating the learning and testing phases on the recruitment data set.
Recent theoretical developments in artificial neural networks (ANNs) have revealed their capability in various data mining tasks such as classification and clustering. For instance, the successive geometric transformations model (SGTM) is a neural-like model proposed by Tkachenko and Izonin [13], which was applied in [14] to electric power consumption prediction for combined-type industrial areas. Izonin et al. [14] demonstrated the efficiency of SGTM compared with statistical regression analysis. In another development, Tkachenko et al. [15] extended the work by proposing a general regression neural network with successive geometric transformation model (GRNN-SGTM) ensemble. The work increased the predictive capability, in terms of accuracy, for missing Internet of Things (IoT) data mining. The radial basis function neural network (RBFNN) is a variant of the multilayer feedforward ANN that has been explored in various applications owing to its forecasting capability. Villca et al. [16] utilized an RBFNN to predict the optimum chemical composition during mining processes, especially in copper tailings flocculation. Mansor et al. [17] incorporated Boolean 2 satisfiability logical representation into RBFNN by obtaining special parameters such as the width and centre. The effective wind speed horizon was forecasted with a higher level of correctness in the work of Madhiarasan [18]. Some works leveraged the Adaline neural network in various forecasting tasks such as power filter optimization [19] and interior permanent magnet synchronous motor (IPMSM) parameter prediction [20]. Both Sujith and Padma [19] and Wang et al. [20] utilized an Adaline neural network as a classifier for the parameters involved in industrial control problems.
The deep convolutional neural network (DCNN) is a powerful ANN variant with multiple hidden layers of neurons that play an important role in data prediction. Li et al. [21] utilized the DCNN for remaining useful life (RUL) assessment and forecasting, extracting insight into the maintenance factors for industrial equipment and machinery. Sun et al. [22] applied the DCNN to city traffic flow management towards the intelligent transport system (ITS). Houidi et al. [23] proposed the DCNN for forecasting the appropriate pattern during non-intrusive load monitoring. These works were successful in prediction tasks, based on the high accuracy values obtained in the simulations.
The Hopfield neural network (HNN) is regarded as one of the earliest ANNs that imitate how the brain computes. HNN was proposed by Hopfield [24] to solve various optimization problems. HNN is a class of recurrent ANN without any hidden layer that demonstrates high-level learning behavior, including an effective learning and retrieval mechanism. An important property of HNN is the energy minimization of the neurons whenever a neuron changes state. Even though HNN has a simple structure (without a hidden layer), it remains relevant to numerous fields of study such as optimization of ANN [25,26], bio-medical imaging [27], engineering [28], mathematics [29], communication [30] and data mining [31]. Since the traditional HNN is prone to a few weaknesses such as lower neuron interpretability [32], logic programming was embedded into HNN as a single intelligent unit [33,34]. Logic programming in HNN capitalizes on an effective logical rule being trained and retrieved by HNN with the aim of generating solutions with the minimum energy. In particular, Sathasivam [35] introduced Horn satisfiability (HORNSAT) in HNN with dynamic neuron relaxation rates. It was observed that the proposed model obtained a higher global minima ratio with dynamic neuron relaxation than with a constant relaxation rate. Kasihmuddin et al. [36] further developed k satisfiability (kSAT) logic programming in HNN, with priority given to improving the learning phase of the model via a genetic algorithm. The work of [36] is a major breakthrough for kSAT as a systematic Boolean satisfiability logical representation without any redundant structure. The simulation confirmed that kSAT logic programming in HNN attains optimal final states that drive towards global minimum solutions more effectively than [35].
Mansor and Sathasivam [37] formulated a variant of kSAT, known as 3 satisfiability (3SAT) logic programming, specifically as a representation of a 3-dimensional logical structure. The systematic 3SAT logical rule coined in [37] complies with HNN, as shown by performance evaluation metrics such as global minima ratio, Hamming distance and computation (CPU) time. Velavan et al. [38] proposed mean field theory (MFT) by implementing a Boltzmann machine and output squashing via a hyperbolic activation function for Horn satisfiability logic programming in HNN. Theoretically, the work in [38] slightly outperformed [35] even though a similar logic structure was utilized. Kasihmuddin et al. [39,40] extended restricted maximum k-satisfiability (MAXkSAT) programming in HNN, where the emphasis was given to unsatisfiable logical rules under different levels of neuron complexity. Based on the study reported in [40], the MAXkSAT logical rule performed optimally compared to the Kernel Hopfield neural network (KHNN). The effectiveness of various logic programming in HNN has been demonstrated in the aforementioned works, which brings another perspective of representing real data in the form of logical representation. In short, we need a well-established logical rule that can represent the behavior of the recruitment data.
Logic mining is a variant of the data extraction process that leverages Boolean logic and ANN. Ref. [41] proposed a logic mining method in HNN by implementing the reverse analysis method. The proposed logic mining technique is capable of extracting the logical rule among neurons. The early work on the reverse analysis method incorporating the Horn satisfiability logical rule in a neural network was introduced by [42] in processing customer demand from a supermarket via reverse analysis simulation. However, the extracted logical representation contained redundancy and was found to be non-systematic, which limits the interpretability of the behavior of a particular real data set. Hence, a notable model known as k satisfiability (kSAT) in reverse analysis has been implemented in various applications. An efficient medical diagnosis of non-communicable diseases such as hepatitis, diabetes and cancer was implemented in Kasihmuddin et al. [43] by employing 2 satisfiability reverse analysis (2SATRA). The proposed logic mining technique extracted the behavior or symptoms of the non-communicable diseases with more than 83% accuracy. Moreover, Kho et al. [44] developed the 2SATRA method to extract the optimal relationship between the strategies and gameplay in League of Legends (LoL), a well-known electronic sport (e-sport). The extracted logic with the highest accuracy will benefit e-sport coaches and players. In another development, Alway et al. [31] extracted the logical rule that represents the behaviour of the prices of palm oil and other commodities by using 2SATRA and HNN. It was found that systematic weight management affected the optimal induced logic as a representation of the palm oil prices data set. These works have emphasized logic mining in terms of systematic 2SAT representation, with a good capability in exploring the behaviour of the data.
In addition, Zamri et al. [45] proposed 3 satisfiability reverse analysis (3SATRA) to extract the prioritized factors for granting or revoking employees' resource applications in an international online shopping platform. The work in [45] recorded 94% accuracy, indicating the effectiveness of the logic mining approach over conventional methods. Nevertheless, the implementation of k satisfiability (kSAT) reverse analysis in data extraction is still limited, especially in recruitment evaluation. Thus, in this research a novel energy based k satisfiability reverse analysis method is developed to extract the correct recruitment factors that lead to positive recruitment in an insurance company in Malaysia. By using the extracted logical rule, the recruitment personnel are expected to properly strategize their recruitment force and target good-quality insurance agents.
The contributions of this work are as follows: (a) to convert the E recruitment data set into a systematic form based on kSAT representation; (b) to propose the energy based k satisfiability reverse analysis method as an alternative approach for extracting the relationships between the factors or attributes that contribute to positive recruitment, based on E recruitment data obtained from an insurance agency in Malaysia; (c) to assess the capability and accuracy of two variants of the proposed method, based on 2SAT and 3SAT logical representation, compared to Horn satisfiability in completing the E recruitment data extraction with different numbers of clauses. Performance evaluation metrics are adopted to evaluate the effectiveness of both the proposed method and the logical representations as an alternative data extraction method for the E recruitment system. The general implementation of the energy based k satisfiability reverse analysis method and HNN in extracting E recruitment (ER) data is illustrated in Figure 1.

This paper is organized as follows. Section 2 discusses the theoretical aspects of k satisfiability representation in general. Section 3 emphasizes the implementation of k satisfiability representation in the discrete Hopfield neural network by introducing the essential formulations used during the learning and testing phases of our model. Following this, Section 4 highlights the fundamental concepts of the energy based k satisfiability reverse analysis method as the data extraction paradigm for the E recruitment evaluation data set. Section 5 presents the implementation of the energy based k satisfiability reverse analysis method in extracting the insightful relationships among the attributes of the E recruitment data set. Following that, Sections 6 and 7 focus on the formulation of performance evaluation metrics and the simulation setup involved in this work. Section 8 presents the results and discussion based on the performance of our proposed energy based k satisfiability reverse analysis method in learning and testing using the E recruitment data sets. Section 9 reports the concluding remarks of this work.

k Satisfiability Representation
k satisfiability (kSAT) is a logical representation that consists of strictly k variables per clause [36]. The properties of kSAT can be summarized as follows:
i. A set of m logical variables, $x_1, x_2, \ldots, x_m$. Each variable stores a bipolar value $x_i \in \{1, -1\}$, representing TRUE and FALSE respectively.
ii. Each variable $x_i$ can appear as a literal, where the positive literal and negative literal are defined as $x_i$ and $\neg x_i$ respectively.
iii. A set of n distinct clauses, $C_1, C_2, C_3, \ldots, C_n$. The clauses $C_i$ are connected by logical AND ($\wedge$). Every k literals form a single $C_i$ and are connected by logical OR ($\vee$).
By using properties (i) to (iii), we can give the explicit definition of the kSAT formulation $P_{kSAT}$:
$$P_{kSAT} = \bigwedge_{i=1}^{n} C_i^{(k)}$$
This study considers $k = 2, 3$. The definitions of $C_i^{(k)}$ for $k = 2$ and $k = 3$ follow from property (iii), where $C_i^{(2)}$ and $C_i^{(3)}$ are the 2SAT and 3SAT clauses respectively. Consider the formulation of $P_{kSAT}$ that consists of the variables A, T, G, E, O and S, where $\neg A$ signifies the negation of variable A.
Examples of $P_{kSAT}$ for $k = 2$ and $k = 3$ are given in Equations (4) and (5) respectively. According to Equation (4), a possible assignment that leads to $P_{2SAT} = 1$ is $(A, T, E, O, G, S) = (1, 1, 1, 1, 1, 1)$. In this case, $P_{2SAT}$ is said to be satisfiable because all clauses $C_i^{(2)}$ are satisfied. Both $P_{2SAT}$ and $P_{3SAT}$ must be represented in conjunctive normal form (CNF) because the satisfiability nature of CNF can be conserved, unlike other forms such as disjunctive normal form (DNF). Equations (4) and (5) do not consider any redundant variables that may result in the unsatisfiable nature of Equations (2) and (3). In addition, we only consider assignments that lead to $P_{2SAT} = 1$; the interested reader may refer to [34,40] for $P_{2SAT} = -1$. In this paper, we represent the information of the data sets in the form of attributes. The attributes are defined as variables in $P_{2SAT}$ and become the symbolic rule for the ANN.
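The bipolar evaluation of a CNF formula described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `(variable, sign)` encoding of literals, the function names, and the example formula are all hypothetical, and the formula is not the paper's Equation (4).

```python
from typing import Dict, List, Tuple

# A literal is (variable_name, sign): sign = +1 for X, -1 for NOT X.
Literal = Tuple[str, int]
Clause = List[Literal]


def clause_value(clause: Clause, state: Dict[str, int]) -> int:
    """Return 1 if at least one literal in the clause is TRUE, else -1."""
    return 1 if any(state[var] * sign == 1 for var, sign in clause) else -1


def formula_value(formula: List[Clause], state: Dict[str, int]) -> int:
    """Return 1 if every clause is satisfied (CNF semantics), else -1."""
    return 1 if all(clause_value(c, state) == 1 for c in formula) else -1


# Hypothetical 2SAT formula over the variables A, T, E, O, G, S:
# (A or not T) and (E or not O) and (G or not S)
p_2sat = [[("A", 1), ("T", -1)], [("E", 1), ("O", -1)], [("G", 1), ("S", -1)]]
all_true = {v: 1 for v in "ATEOGS"}
print(formula_value(p_2sat, all_true))
```

Flipping A to -1 while keeping T at 1 falsifies the first clause of this hypothetical formula, so the whole conjunction evaluates to -1.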

k Satisfiability in Discrete Hopfield Neural Network
The Hopfield neural network (HNN) was popularized by J. J. Hopfield in 1985 to solve various optimization problems [24]. Consider the conventional HNN model that consists of N mutually connected neurons $S = (S_1, S_2, S_3, \ldots, S_N)$, where $S_i \in \{-1, 1\}$. All $S_i$ are assumed to be updated asynchronously according to Equation (6), where $W_{ij}$ is the synaptic weight that connects neuron j to neuron i (out of a total of N interconnected neurons) with a pre-determined bias $\beta$. In HNN, the state produced by Equation (6) signifies a possible solution for any given optimization problem. Note that the final state obtained is further verified by final energy analysis, where the aim of HNN is to find the final state that corresponds to the global minimum energy. By capitalizing on the updating property in Equation (6), $P_{kSAT}$ is reported to be compatible with the structure of HNN [46]. In this work, $P_{kSAT}$ is embedded as a symbolic instruction into the HNN by assigning each of the m variables to a neuron. For any given $P_{kSAT}$ embedded into HNN, the cost function $E_{P_{kSAT}}$ is defined in Equation (7) [36], where n and m signify the number of clauses and variables in $P_{kSAT}$ respectively. The inconsistencies of $\neg P_{kSAT}$ are expressed in terms of $S_X$, the neuron state for variable X. The cost function is $E_{P_{kSAT}} = 0$ when every $C_i$ in Equation (2) reads $C_i = 1$, and $E_{P_{kSAT}} \neq 0$ if at least one $C_i = -1$. The primary aim of the hybrid HNN is to minimize the value of $E_{P_{kSAT}}$ as the number of $P_{kSAT}$ clauses increases. The updating rule for the final state of HNN that incorporates the proposed $P_{kSAT}$ is based on the local field $h_i(t)$, where $W_{ijk}^{(3)}$, $W_{ij}^{(2)}$ and $W_i^{(1)}$ are the third order (three neuron connections per clause), second order (two neuron connections per clause) and first order (one neuron connection per clause) synaptic weights respectively.
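To make the cost function concrete, the sketch below counts unsatisfied clauses using the inconsistency form common in the HNN-kSAT literature: a positive literal X contributes $(1 - S_X)/2$ and a negated literal contributes $(1 + S_X)/2$. The exact notation of Equations (7) and (8) is not reproduced here, so the per-literal terms, the `(variable, sign)` encoding and the function name `cost` should be read as assumptions.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, int]  # (variable, sign); sign = -1 marks a negated literal
Clause = List[Literal]


def cost(formula: List[Clause], state: Dict[str, int]) -> float:
    """Sum over clauses of the product of per-literal inconsistency terms.

    Assumed form: a literal with sign s contributes (1 - s * S_X) / 2, so the
    product for a clause is 1 only when every literal in it is FALSE.
    """
    total = 0.0
    for clause in formula:
        term = 1.0
        for var, sign in clause:
            term *= 0.5 * (1 - sign * state[var])
        total += term
    return total


# Hypothetical formula: (A or not B) and (C or D)
formula = [[("A", 1), ("B", -1)], [("C", 1), ("D", 1)]]
print(cost(formula, {"A": 1, "B": 1, "C": 1, "D": 1}))
print(cost(formula, {"A": -1, "B": 1, "C": -1, "D": -1}))
```

Under this form the cost is zero exactly when every clause is satisfied, and otherwise equals the number of violated clauses, which matches the behaviour described in the text.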
The synaptic weights of $P_{kSAT}$ in HNN are symmetric with zero self-connection, as stated in Equations (12)-(14). In practice, the optimal values for Equations (12)-(14) can be obtained by comparing Equation (7) with the final energy in Equations (15) and (16). The step of comparing the final energy with the cost function is known as the Wan Abdullah method [33]. Note that the final energy $H_{P_{kSAT}}$ in Equations (15) and (16) can be obtained by considering the asynchronous update of the neuron states for a specific $P_{kSAT}$. The quality of the neuron states is evaluated with the energy function in Equation (15) for $k = 2$ and Equation (16) for $k = 3$. $H_{P_{kSAT}}$ can be further updated by computing the difference in the energy produced by the local field. Consider the neuron update for $P_{kSAT}$ at time t. The final energy of $P_{kSAT}$ is given in Equation (17), where $H_{P_{kSAT}}(t)$ refers to the energy before the update by q, which stores the $P_{kSAT}$ patterns. Thus, the updated $H_{P_{kSAT}}(t + 1)$ in Equation (18) verifies the final state produced after the learning and retrieval phases. The difference in energy level is obtained by substituting Equations (17) and (18); simplifying the result and comparing it with Equation (10) gives Equation (20). Based on Equation (20), it can be concluded that HNN will reach a state whereby the energy cannot be reduced further. Such states indicate the optimized final states that lead to $\Delta H_{P_{kSAT}} = 0$. This equilibrium ensures the early validation of the final state of the neurons. Figure A1 shows the implementation of $P_{kSAT}$ in HNN. Another noteworthy aspect of the implementation of $P_{kSAT}$ in HNN is the ability of the model to calculate the expected minimum energy $H^{\min}_{P_{kSAT}}$, defined as the absolute minimum energy achieved during the retrieval phase. $H^{\min}_{P_{kSAT}}$ can be obtained by using Equation (21), where $\gamma$ is defined in terms of the clause count n and the clauses $C_i^{(k)}$. The previous $H^{\min}_{P_{kSAT}}$ computation in [35,36,37,38] requires randomized states and the energy function, which is simplified in Equation (21) for $P_{kSAT}$.
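For orientation, the local field and energy functions referred to in this section usually take the following form in the higher-order Hopfield literature. This is a sketch under assumptions rather than a transcription of Equations (9)-(20): the normalisation factors, the zero threshold in the sign update, and the omission of equation numbers are all choices made here. The last lines work through the Wan Abdullah comparison for a single hypothetical clause $(A \vee B)$.

```latex
\begin{align*}
% Local fields (assumed standard form)
h_i(t) &= \sum_{j} W_{ij}^{(2)} S_j + W_i^{(1)} && \text{(2SAT)} \\
h_i(t) &= \sum_{j}\sum_{l} W_{ijl}^{(3)} S_j S_l + \sum_{j} W_{ij}^{(2)} S_j + W_i^{(1)} && \text{(3SAT)} \\
S_i(t+1) &= \begin{cases} 1, & h_i(t) \geq 0 \\ -1, & \text{otherwise} \end{cases} \\[4pt]
% Energy functions (assumed standard form)
H_{P_{2SAT}} &= -\tfrac{1}{2}\sum_{i}\sum_{j} W_{ij}^{(2)} S_i S_j - \sum_{i} W_i^{(1)} S_i \\
H_{P_{3SAT}} &= -\tfrac{1}{3}\sum_{i}\sum_{j}\sum_{l} W_{ijl}^{(3)} S_i S_j S_l
               - \tfrac{1}{2}\sum_{i}\sum_{j} W_{ij}^{(2)} S_i S_j - \sum_{i} W_i^{(1)} S_i \\[4pt]
% Energy change when only neuron i is updated (2SAT, symmetric weights, zero diagonal)
\Delta H_{P_{2SAT}} &= -\bigl(S_i(t+1) - S_i(t)\bigr)\, h_i(t) \;\leq\; 0 \\[4pt]
% Worked comparison for one hypothetical clause (A \vee B)
E_{(A \vee B)} &= \tfrac{1}{4}(1 - S_A)(1 - S_B)
               = \tfrac{1}{4}\bigl(1 - S_A - S_B + S_A S_B\bigr) \\
\Rightarrow\quad W_A^{(1)} &= W_B^{(1)} = \tfrac{1}{4}, \qquad
W_{AB}^{(2)} = W_{BA}^{(2)} = -\tfrac{1}{4}, \qquad
H_{(A \vee B)} = E_{(A \vee B)} - \tfrac{1}{4}
\end{align*}
```

In this worked clause the energy differs from the cost only by the constant $\tfrac{1}{4}$, so a satisfied 2SAT clause contributes $-\tfrac{1}{4}$ to the energy. That is consistent with a minimum energy that scales with the number of clauses, as the text indicates for Equation (21).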
By taking into account the value obtained from Equations (15) and (16), the final state of HNN is considered optimal if the network satisfies the condition in Equation (22), where $\partial$ is a tolerance value pre-determined by the user. It is worth mentioning that the final state from Equations (15) and (16) will be converted into induced logic. In several studies that implemented other types of logical rule such as HORNSAT [35], or improved HNN models such as mean field theory (MFT) [38], the final state is likewise converted into induced logic during logic extraction. In this work, the final states that correspond to the global minimum energy are the focus during the retrieval phase of the HNN model. The idea of energy based HNN is extended to improve existing logic mining through energy verification of each induced logic extracted by the model. Hence, the newly proposed energy based k satisfiability reverse analysis is hybridized with HNN to extract the behavior of the data set.
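The energy check can be illustrated for the 2SAT case with a short Python sketch. It assumes the standard second-order energy function, weights obtained by matching the cost and energy term by term (first-order weight sign/4 per literal, second-order weight minus the product of signs over 4), and a minimum energy of minus one quarter per clause. The function names are hypothetical, and each clause is assumed to contain two distinct variables.

```python
from typing import Dict, List, Tuple

Clause2 = Tuple[Tuple[str, int], Tuple[str, int]]  # two (variable, sign) literals


def weights_2sat(formula: List[Clause2]):
    """First- and second-order weights from matching cost and energy terms."""
    w1: Dict[str, float] = {}
    w2: Dict[Tuple[str, str], float] = {}
    for (a, sa), (b, sb) in formula:
        w1[a] = w1.get(a, 0.0) + sa / 4
        w1[b] = w1.get(b, 0.0) + sb / 4
        # Stored symmetrically so that W_ab = W_ba.
        w2[(a, b)] = w2.get((a, b), 0.0) - sa * sb / 4
        w2[(b, a)] = w2.get((b, a), 0.0) - sa * sb / 4
    return w1, w2


def energy(w1, w2, state: Dict[str, int]) -> float:
    """H = -1/2 * sum_ij W_ij S_i S_j - sum_i W_i S_i (assumed standard form)."""
    pair = sum(w * state[i] * state[j] for (i, j), w in w2.items())
    single = sum(w * state[i] for i, w in w1.items())
    return -0.5 * pair - single


def is_global_minimum(h: float, n_clauses: int, tol: float = 1e-4) -> bool:
    """Accept the state if |H - H_min| <= tol, with H_min = -n_clauses / 4."""
    return abs(h - (-n_clauses / 4)) <= tol


# Hypothetical formula: (A or not T) and (E or not O)
formula = [(("A", 1), ("T", -1)), (("E", 1), ("O", -1))]
w1, w2 = weights_2sat(formula)
h_sat = energy(w1, w2, {"A": 1, "T": 1, "E": 1, "O": 1})
h_unsat = energy(w1, w2, {"A": -1, "T": 1, "E": 1, "O": 1})
print(h_sat, is_global_minimum(h_sat, len(formula)))
print(h_unsat, is_global_minimum(h_unsat, len(formula)))
```

With these weights the energy equals the cost minus one quarter per clause, so only satisfying assignments pass the tolerance test. The 3SAT case would need third-order weights and is not covered by this sketch.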

Energy Based k Satisfiability Reverse Analysis Method (EkSATRA)
One of the limitations of the standalone HNN in knowledge extraction is the interpretation of the output. Usually, the output of the conventional HNN is interpreted in terms of bipolar states, which requires expensive output checking. Hence, logic mining connects a propositional logical rule (HNN-2SAT or HNN-3SAT) with knowledge extraction by implementing ANN as a learning system. The pattern of the data set can be extracted and represented via the logical rule obtained by HNN. This section formulates an improved reverse analysis method, the energy based k satisfiability reverse analysis method or EkSATRA. EkSATRA is structurally different from the previous kSATRA because only logical rules that comply with Equation (22) are converted into induced logic. The following algorithm illustrates the implementation of EkSATRA:
Step 1: Consider n attributes $(S_1, S_2, S_3, S_4, \ldots, S_n)$ of the data set. Convert all binary data into bipolar representation, mapping 1 to 1 and 0 to −1. The state of $S_i$ is defined based on the neural network convention, where 1 is considered TRUE and −1 FALSE. $S_i$ and $S_j$, $i \neq j$, are the collections of neurons that represent the clauses $C_i^{(2)}$ and $C_i^{(3)}$ for $P_{2SAT}$ and $P_{3SAT}$ respectively. Hence, the collection of $C_i^{(k)}$ that leads to a positive outcome of the learning data, $P_i^l = 1$, is segregated.
Step 2: Calculate the collection of $C_i^{(k)}$ that most frequently leads to $P_i^l = 1$, which gives the optimum logic $P_{best}$ of the data set. Note that $P_{best}$ must be in the form of Boolean algebra [44] corresponding to Equations (2) and (3). Derive the cost function $E_{P_{best}}$ by using Equation (7).
Step 3: Find the state of $S_i$ that corresponds to $E_{P_{best}} = 0$. By comparing $E_{P_{best}}$ with $H_{P_{best}}$, the synaptic weights of HNN-kSAT are obtained as in [33].
Step 4: By using Equations (9) and (10), obtain the final neuron states $S_i^{induced}$. Each variable is then assigned according to the corresponding $S_i^{induced}$, and this variable assignment formulates the induced logic $P_i^B$.
Step 5: Calculate the final energy that corresponds to the value of $S_i^{induced}$ by using Equations (15) and (16). Verify the energy against the condition in Equation (26), where the threshold value $\partial$ is predefined by the user (usually $10^{-4}$). According to Equation (26), only the $P_i^B$ that achieve the global minimum energy proceed to the testing phase.
Step 6: Construct the induced logic $P_i^B$ from Equation (26). By using test data from the data set, verify whether $P_i^B = P_i^{test}$. Note that $P_i^{test}$ is the test data provided by the user.
The induced logic $P_i^B$ verified by the energy function is extracted at the end of EkSATRA, indicating the correct $P_{kSAT}$ logical representation of the behavior of the data set. This differs from the existing work in [44], which focuses on induced logic that is not verified by the energy function. This innovation is important in deciding the quality of the $P_i^B$ produced at the end of the retrieval phase.
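Steps 1 and 2 of the algorithm can be sketched as follows. This is only one plausible reading of how $P_{best}$ is formed: for each group of k attributes, the bipolar sign pattern seen most often among learning rows with a positive outcome is kept as the clause. The grouping of attributes into clauses, the function names and the toy data are hypothetical, and the code assumes at least one positive learning row.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Literal = Tuple[int, int]  # (attribute index, sign)


def to_bipolar(row: Sequence[int]) -> List[int]:
    """Map binary entries {0, 1} to bipolar {-1, 1}."""
    return [1 if x == 1 else -1 for x in row]


def best_logic(rows: List[Sequence[int]], outcomes: Sequence[int],
               clause_vars: List[Tuple[int, ...]]) -> List[List[Literal]]:
    """For each attribute group, keep the sign pattern that occurs most often
    among learning rows whose outcome is positive (1)."""
    formula = []
    for group in clause_vars:
        counts = Counter(
            tuple(to_bipolar(row)[i] for i in group)
            for row, y in zip(rows, outcomes) if y == 1
        )
        pattern, _ = counts.most_common(1)[0]
        formula.append([(i, s) for i, s in zip(group, pattern)])
    return formula


# Toy learning data: four candidates, four binary attributes, outcome 1 = Attend.
rows = [[1, 0, 1, 1], [1, 0, 0, 1], [0, 1, 1, 1], [1, 0, 1, 0]]
outcomes = [1, 1, 1, 0]
p_best = best_logic(rows, outcomes, clause_vars=[(0, 1), (2, 3)])
print(p_best)
```

A full EkSATRA run would then derive the weights from this formula, relax the network, and keep only induced logic whose final energy passes the tolerance test in Step 5.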

EkSATRA in E Recruitment Data Set
E recruitment (ER) data is a data set obtained from an insurance agency in Malaysia [47]. The data set contains information on 155 candidates, such as age, past occupation, education background, origin, online texting, criminal record, keep in view list, citizenship status, and source of candidate. Previously, the insurance agency attempted to analyze the data by using statistical approaches such as logistic regression. Even though the results were acceptable, the behavior of the ER data set remained difficult to observe. Thus, the recruiter requires a comprehensive approach so that the behavior of the data set can be extracted systematically even when new data are added in the future. Hence, the logic mining approach via kSATRA provides the recruiter from the insurance agency with a solid logical rule that represents the behavior of the ER data set.
In this work, different attributes are embedded in 2SATRA and 3SATRA respectively. The aim of using the ER data is to extract the logical rule that explains the behavior of the candidates. This logical rule determines their attendance at the pre-requisite seminar. The ER data are divided into learning data and testing data. In the learning data set, {Attend, Not Attend} is converted into the bipolar representation {1, −1} respectively. Each attribute of the candidate is represented by a neuron in kSATRA, so a total of k neurons per clause are considered in this data set. In this regard, kSATRA contains a collection of neuron formulas that lead to $P_i^l$ = Attend ($P_i^l = 1$) or $P_i^l$ = Not Attend ($P_i^l = -1$). For example, one of the candidates $P_i^l$ is able to communicate via online texting (WhatsApp or Facebook Messenger), is aged less than 25 years old, has no past occupation, has an education background higher than SPM (Sijil Pelajaran Malaysia; the Malaysian high school diploma), does not originate from Kota Kinabalu (headquarters of the company), and sent the resume through email. These attributes are given a corresponding neuron interpretation. By converting the attributes into a logical rule, $P_i^l$ reads as in Equation (28), where $P_i^l = 1$ for candidate $i = 1$. In other words, EkSATRA "learned" that the candidate attended the pre-requisite seminar ($P_i^l = 1$) if they satisfy any of the neuron interpretations in Equation (28). The above steps are repeated to find the rest of the $P_i^l$, where $i = 1, 2, 3, \ldots, N$. Hence, the network obtains the initial $P_{best}$, which is embedded into HNN. In order to derive the correct synaptic weights, the network finds the correct interpretation that leads to $E_{P_{best}} = 0$. During the retrieval phase of EkSATRA, HNN retrieves the induced logic that optimally explains the relationship of the attributes for the candidates.
One possible induced logical rule $P_i^B$ is given in Equation (29), which is the logical rule that generalizes the behavior of all the candidates in HNN. The symbol (←) represents the implication of variables that leads to $P_i^B$. The same logic extraction method is applied to the 3SAT representation. The information from the logical rule helps the recruitment team to analyze and generalize the performance of the candidates based on simple logical induction. By using the induced logic $P_i^B$, the recruitment team can classify the attendance of a candidate at the pre-requisite seminar. This induced logic assists the recruitment team in creating more effective strategies that address only the significant attribute(s) that reduce the number of absentees at the company's events. Less attention is given to unimportant attributes, which reduces cost and management time. The full implementation of EkSATRA in ER data set extraction is shown as a block diagram in Figure 2. Figure 2 shows that EkSATRA can be divided into learning and retrieval phases, before obtaining the logical representation that can be used to explain the relationship and behaviour of the ER data set.

Performance Evaluation Metric
In this section, four performance evaluation indicators are deployed to analyze the effectiveness of the EkSATRA model in extracting important logical rules from the ER data set. All of the metrics evaluate the performance of both the learning phase and the testing phase. Since the ANN integrated in EkSATRA is HNN, the metrics solely indicate the performance of the retrieved neuron states that contribute to $P_i^B$. During the learning phase, the performance of the kSAT representation that governs the network is evaluated with the fitness $f_i$ of Equation (30), in which NC is the number of clauses for any given $P_i^B$ and $C_i$ is the clause term. Note that as $f_i$ approaches NC, the value of $E_{P_{kSAT}}$ is minimized to zero. Using the obtained $f_i$, the performance of the learning phase is evaluated based on root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and computation time (CPU).
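A minimal sketch of the fitness count, assuming that $C_i$ equals 1 when clause i is satisfied and 0 otherwise, which is consistent with $f_i$ approaching NC as learning completes. The clause encoding, the attribute names and the function names are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Tuple

# A literal is (attribute_name, is_positive); a clause is a list of literals.
Literal = Tuple[str, bool]
Clause = List[Literal]


def clause_satisfied(clause: Clause, state: Dict[str, int]) -> bool:
    """A clause (disjunction) holds if at least one of its literals is true.

    `state` maps each attribute to a bipolar value: 1 (true) or -1 (false).
    """
    return any((state[name] == 1) == positive for name, positive in clause)


def fitness(clauses: List[Clause], state: Dict[str, int]) -> int:
    """Number of satisfied clauses (assumed reading of the sum of C_i)."""
    return sum(1 for clause in clauses if clause_satisfied(clause, state))


# Hypothetical 2SAT rule with NC = 2: (A or not B) and (C or D)
rule = [[("A", True), ("B", False)], [("C", True), ("D", True)]]
state = {"A": -1, "B": -1, "C": -1, "D": -1}
print(fitness(rule, state), "of", len(rule), "clauses satisfied")
```

Under this reading, learning is complete for a given state once `fitness(rule, state)` equals `len(rule)`, that is, once $f_i = NC$.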

Root Mean Square Error
Root mean square error (RMSE) [36] is a standard error estimator that has been widely used in prediction and classification. During the learning phase, RMSE measures the deviation of the error between the current value $f_i$ and NC with respect to the mean of $f_i$. In the Learning_RMSE equation, $f_{max} = NC$, so $f_{max}$ depends on the number of kSAT clauses. During the retrieval phase, Testing_RMSE measures the deviation of the error between the current $P_i^B$ and the state of $P_i^{test}$.
Note that a lower value of Learning_RMSE signifies the compatibility of kSAT in EkSATRA; likewise, a lower value of Testing_RMSE signifies a small error deviation of the proposed $P_i^B$ with respect to $P_i^{test}$.
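As an illustration, the sketch below uses the standard textbook form of RMSE, the square root of the mean squared difference, for both phases. This form is an assumption: the exact normalization used by EkSATRA may differ. The fitness values and neuron states are made-up numbers.

```python
import math
from typing import Sequence


def rmse(targets: Sequence[float], values: Sequence[float]) -> float:
    """Standard RMSE between paired target and observed values."""
    if len(targets) != len(values) or len(targets) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - v) ** 2 for t, v in zip(targets, values))
    return math.sqrt(squared / len(targets))


# Learning phase: compare each recorded fitness f_i with f_max = NC.
nc = 4
fitness_history = [1, 2, 3, 4]  # hypothetical f_i per learning iteration
learning_rmse = rmse([nc] * len(fitness_history), fitness_history)

# Retrieval phase: compare induced bipolar states with the test states.
induced_states = [1, -1, 1, 1]  # hypothetical states of the induced logic
test_states = [1, 1, 1, -1]     # hypothetical states of the test data
testing_rmse = rmse(test_states, induced_states)

print(round(learning_rmse, 4), round(testing_rmse, 4))
```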

Mean Absolute Error
The mean absolute error (MAE) [44] is another type of error that evaluates the straightforward difference between the expected value and the current value. During the learning phase, MAE measures the absolute difference between the current value $f_i$ and NC. In the Learning_MAE equation, $f_{max} = NC$, so $f_{max}$ depends on the number of kSAT clauses. During the retrieval phase, Testing_MAE measures the deviation of the error between the current $P_i^B$ and the state of $P_i^{test}$.
Note that a lower value of Learning_MAE signifies the compatibility of kSAT in EkSATRA; likewise, a lower value of Testing_MAE signifies a small error deviation of the proposed $P_i^B$ with respect to $P_i^{test}$.

Mean Absolute Percentage Error
Mean absolute percentage error (MAPE) [44] measures the size of the error in percentage terms. During the learning phase, MAPE measures the percentage difference between the current value $f_i$ and NC. In the Learning_MAPE equation, $f_{max} = NC$, so $f_{max}$ depends on the number of kSAT clauses. During the retrieval phase, Testing_MAPE measures the percentage deviation of the error between the current $P_i^B$ and the state of $P_i^{test}$.
Note that a lower value of Learning_MAPE signifies the compatibility of kSAT in EkSATRA; likewise, a lower value of Testing_MAPE signifies a small error deviation of the proposed $P_i^B$ with respect to $P_i^{test}$. In other words, a high value of Learning_MAPE results in more cases of $E_{P_{kSAT}} \neq 0$, which is reported to affect the quality of the retrieval phase. Hence, an inconsistent $P_i^B$ results in lower accuracy of the proposed logic mining.
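The two absolute-error metrics can be contrasted with a short sketch. It assumes the standard definitions (mean absolute difference, and mean absolute difference relative to the target, in percent); the exact forms used in EkSATRA may differ, and the fitness values are made-up numbers.

```python
from typing import Sequence


def mae(targets: Sequence[float], values: Sequence[float]) -> float:
    """Mean absolute difference between target and observed values."""
    if len(targets) != len(values) or len(targets) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - v) for t, v in zip(targets, values)) / len(targets)


def mape(targets: Sequence[float], values: Sequence[float]) -> float:
    """Mean absolute error relative to each (non-zero) target, in percent."""
    if len(targets) != len(values) or len(targets) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in targets):
        raise ValueError("targets must be non-zero for MAPE")
    total = sum(abs((t - v) / t) for t, v in zip(targets, values))
    return 100.0 * total / len(targets)


nc = 4
fitness_history = [1, 2, 3, 4]  # hypothetical f_i per learning iteration
f_max = [nc] * len(fitness_history)

learning_mae = mae(f_max, fitness_history)
learning_mape = mape(f_max, fitness_history)
print(learning_mae, learning_mape)
```

Because $f_{max} = NC$ is never zero and bipolar test states are always 1 or -1, the division in `mape` is well defined in both phases under these assumptions.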

CPU Time
CPU time is defined as the time taken by a model to complete the learning phase and the retrieval phase. For the learning phase, CPU time is measured from $f_i$ until $f_{max}$ is reached, whereas for the retrieval phase it is measured starting from the initial neuron state. Hence, CPU_Time can be defined simply as the sum of the time taken by the learning phase and the time taken by the retrieval phase. CPU_Time has been utilized in several papers [34,40] for examining the complexity of the proposed HNN-kSAT model.
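A minimal sketch of how the two phases could be timed, assuming that the total is the sum of the per-phase processor times. The two phase functions are hypothetical stand-ins for the real learning and retrieval routines.

```python
import time
from typing import Callable, Tuple


def timed(phase: Callable[[], object]) -> Tuple[object, float]:
    """Run a phase and return its result with the CPU seconds it consumed."""
    start = time.process_time()
    result = phase()
    return result, time.process_time() - start


def learning_phase() -> int:
    # Placeholder workload standing in for the search from f_i to f_max.
    return sum(i * i for i in range(200_000))


def retrieval_phase() -> int:
    # Placeholder workload standing in for neuron state retrieval.
    return sum(i for i in range(100_000))


_, learning_time = timed(learning_phase)
_, retrieval_time = timed(retrieval_phase)
cpu_time = learning_time + retrieval_time
print(f"CPU time: {cpu_time:.4f} s")
```

`time.process_time` counts processor time of the current process rather than wall-clock time, so time spent sleeping or waiting does not inflate the measurement.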

Simulation Setup
The simulation is designed to evaluate the capability of EkSATRA in extracting the relationship between the ER data attributes in terms of the optimal $P_i^B$. In this study, 60% of the candidate data set in ERHNN is used as $P_i^l$ in the learning phase of EkSATRA and 40% is used as $P_i^{test}$. The learning to testing data ratio of 3:2 is chosen to comply with the work of Kho et al. [44]. All HNN models were implemented in Dev C++ Version 5.11 on Windows 10 (Intel Core i3, 1.7 GHz processor) with different complexities. To avoid possible bad sectors, all simulations were conducted on the same device. According to [36], the threshold CPU time for both the learning phase and the testing phase is set to one day (24 h). If EkSATRA exceeds this threshold CPU time, $P_i^B$ is not compared with $P_i^{test}$.
Regarding the neuron variation issue during the retrieval phase, clausal noise (CN) has been added to avoid possible overfitting. In the equation relating NC and CN, $NC_{attribute}$ is the candidate's attribute in HNN. In this study, $NC_{attribute} = 1$ is used, considering that the number of clauses in each $P_{kSAT}$ corresponds to the value of $NC_{attribute}$. In practice, NC has a linear relationship with the number of CN, and $P_{3SAT}$ is expected to experience more CN than $P_{2SAT}$. In terms of the logical rule embedded inside HNN, the existing work of Sathasivam and Abdullah [41] implemented HORNSAT in their reverse analysis method. In that study [41], each clause of the embedded HORNSAT logical rule must contain at most one positive literal. The HORNSAT embedded in HNN was later improved by Velavan et al. [38], who proposed the combination of a hyperbolic activation function and a Boltzmann machine to reduce unnecessary neuron oscillation during the retrieval phase. These two models are considered the only existing logic mining methods in the literature, and they are abbreviated as HNN-HORNSAT and HNN-MFTHORNSAT.
Table 1 lists the parameter setup for the HNN-kSAT models. The important parameters, namely the neuron combinations, number of learning, number of trials and neuron string, are set to 100 to comply with the work of Kasihmuddin et al. [39]. Neuron combination is defined as the number of possible input combinations during the simulation. Number of learning is the number of learning iterations required for the proposed method to achieve $E_{P_{kSAT}} = 0$ during the learning phase. Number of trials is the number of retrieved $P_i^B$ for each neuron combination. An appropriate number of neuron combinations is essential: a large number increases the dimension of the search space of the solution, resulting in a computational burden, whereas a small number leads to local minimum solutions. According to [37], the hyperbolic tangent activation function (HTAF) was chosen due to the differentiability of the function and its capability to establish a non-linear relationship among the neuron connections. The ER data set has no missing values, so the complete data are processed by the proposed method.
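The 3:2 learning-to-testing split described in this section can be sketched as follows. Shuffling before the split and the fixed seed are assumptions made for reproducibility of the illustration; the text does not state how candidates were assigned to each portion. The candidate records are placeholder integers.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_learning_testing(
    records: Sequence[T], learning_fraction: float = 0.6, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the records and split it into learning and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = round(len(shuffled) * learning_fraction)
    return shuffled[:cut], shuffled[cut:]


candidates = list(range(100))  # placeholder for 100 candidate records
learning_set, testing_set = split_learning_testing(candidates)
print(len(learning_set), len(testing_set))
```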

Result and Discussion
The performance of the simulated program with different complexities for the HNN-kSAT models is evaluated against the existing models HNN-HORNSAT [35] and HNN-MFTHORNSAT [38] in terms of root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), accuracy and CPU time. Figures 3 and 4 illustrate the RMSE and MAE of the HNN models during the learning phase. It is worth noting that this analysis considers strictly two or three literals per clause. The data were successfully embedded into the network, forming a learnable kSAT logic. The proposed models, HNN-2SAT and HNN-3SAT, are compared with the existing methods, namely HNN-HORNSAT [35] and HNN-MFTHORNSAT [38]. As seen in Figure 3, HNN-kSAT with NC = 2 until NC = 8 has the best performance in terms of RMSE compared to HNN-HORNSAT and HNN-MFTHORNSAT. HNN-kSAT utilizes logical inconsistencies to help EkSATRA derive the optimum synaptic weights for HNN, and optimal synaptic weights are the building block for an optimum $P_i^B$. The RMSE result in Figure 3 is supported by the MAE values in Figure 4. According to Figure 4, HNN-kSAT with NC = 1 has the best performance in terms of MAE: the MAE for NC = 1 is 0.85, compared with 8.571425 at NC = 4 for HNN-2SAT. As the number of CN increases, the learning phase of EkSATRA becomes more convoluted because HNN-kSAT is required to find a consistent interpretation for $P_{best}$. In this case, the learning phase of EkSATRA for both HNN-2SAT and HNN-3SAT is trapped in a trial-and-error search that leads to RMSE and MAE accumulation. In contrast, the learning phases of HNN-HORNSAT and HNN-MFTHORNSAT are computationally more expensive because more iterations are needed, leading to higher RMSE and MAE values than those of HNN-2SAT and HNN-3SAT. All in all, EkSATRA contributes to generating the best logic to represent the relationship between each instance and the verdict of HNN.
Table 2 presents the MAPE obtained by HNN-2SAT, HNN-3SAT, HNN-HORNSAT and HNN-MFTHORNSAT during the learning of HNN. The MAPE produced by the four models is always less than 100%. The error produced in every iteration for HNN-2SAT and HNN-3SAT during the learning phase increases as NC increases. However, the MAPE recorded by HNN-HORNSAT and HNN-MFTHORNSAT is to some extent higher than that of the proposed models. At NC = 7, the MAPE of HNN-3SAT is approximately 55% bigger than at NC = 8, because more iterations are needed for the network to converge to full fitness (learning completed). A similar trend can be seen in HNN-2SAT as the complexity increases. Thus, HNN-2SAT and HNN-3SAT work optimally in learning the HNN entrenched in the network before it is stored in content addressable memory. A complete learning process ensures that the network generates the best logic to represent the characteristics of the HNN. Furthermore, this logic is deployed during testing by using the remaining 40% of the data entries.
Table 3 displays the CPU time results for the HNN models. To assess the robustness of the models in logic mining, CPU time is recorded for the learning and retrieval phases of HNN. According to Table 3, less CPU time is required to complete one execution of learning and testing for ER when fewer NC are deployed. As it stands, the HNN-2SAT and HNN-3SAT models require a substantial amount of time to complete the learning when the complexity is higher. Overall, the HNN remains competent in minimizing the kSAT inconsistencies and computing the global solution within an acceptable CPU time. The CPU time for HNN-3SAT is consistently higher than that for HNN-2SAT because more instances need to be processed during the learning and testing phases of HNN. However, the CPU times recorded for the existing methods, HNN-HORNSAT and HNN-MFTHORNSAT, are higher still, due to the additional iterations needed to generate the best logic for the HNN.
Table 4 shows the testing errors recorded for the models when testing the HNN. The testing RMSE, MAE, accuracy and MAPE recorded for HNN-2SAT, HNN-3SAT, HNN-HORNSAT and HNN-MFTHORNSAT are consistently similar for each of NC = 1 until NC = 8.
Hence, this demonstrates the capability of the proposed EkSATRA and HNN-kSAT in generating the best logic, $P_{best}$, during the learning phase, which contributes to a very small error during the testing phase. The learning mechanism of EkSATRA in extracting the best logic to map the relationship of the attributes in HNN is acceptable according to the performance evaluation metrics recorded during simulation. According to the accuracy recorded by each model, the proposed model achieved a 63.30% positive recruitment outcome with HNN-2SAT and 85.00% with HNN-3SAT. According to Table 5, a candidate in HNN-3SAT gives a negative result (not attend) if the candidate meets the conditions of the induced logic reported there. In another development, a candidate in HNN-2SAT gives a negative result (not attend) under the conditions of its corresponding induced logic. The existing works on logic mining in HNN (HNN-HORNSAT and HNN-MFTHORNSAT) are unable to achieve at least 60% of the positive recruitment outcome. To sum up, HNN-3SAT incorporated with EkSATRA is the best model in learning and testing HNN due to its lower values of RMSE, MAE and MAPE and the highest accuracy for the logical rule. Hence, the logical rule obtained from HNN-3SAT will benefit the recruitment team in identifying the relationship between the candidates' attributes and their eligibility to attend the pre-requisite seminar.
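For illustration, accuracy figures of this kind could be computed as the percentage of test candidates whose observed outcome matches the outcome predicted by the induced logic. This definition is an assumption, since the accuracy formula is not given in this section, and the outcome vectors below are made-up values.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Percentage of positions where the predicted outcome equals the observed one."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * matches / len(observed)


# Bipolar outcomes: 1 = attends the seminar, -1 = does not attend.
predicted_outcomes = [1, -1, 1, 1, -1]   # hypothetical induced-logic output
observed_outcomes = [1, -1, -1, 1, -1]   # hypothetical test-set outcomes
print(accuracy(predicted_outcomes, observed_outcomes))
```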

Conclusions
In conclusion, the findings indicate a significant improvement from combining the kSAT representation, the logic mining technique and HNN in extracting the behavior of a real data set. Regarding the non-optimal logical representation in the standard reverse analysis method in [41], it has been noted in [43] that a flexible logical rule makes a tremendous impact in processing the ER data set in a more systematic form. In this paper, we have successfully transformed the ER data set into an optimal logical representation in the form of kSAT to best represent the relationships in the ER data set. In addition, we have applied EkSATRA as an alternative approach for extracting the relationships between the attributes that correspond to positive recruitment in the ER data set of an insurance company in Malaysia. Collectively, the proposed model, HNN-kSAT, has explicitly produced the induced logic from the learned data with better accuracy than HNN-MFTHORNSAT and HNN-HORNSAT. Apart from that, the effectiveness of kSAT in optimally representing the attributes of HNN is due to the simplicity of the structure of the logical representation. Hence, the relationships of the attributes in the ER data set have been extracted successfully, with lower error evaluations and better accuracy. To counter the limitations, this research can be further developed by refining the learning phase of EkSATRA with robust learning algorithms, ranging from swarm-based metaheuristics to evolutionary search algorithms. Ultimately, the improved EkSATRA can also be extended to evaluate the retention rate of insurance agents and to find the significant factors in elevating sales production.

Acknowledgments: The authors would like to express special dedication to all of the researchers from AI Research Development Group (AIRDG) for the continuous support, especially Alyaa Alway and Nur Ezlin Zamri.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix J
The schematic diagram for HNN models for k = 2 is given as follows:

Figure A1. The Schematic Diagram of HNN-kSAT.