Article

Using Dynamic Adjusting NGHS-ANN for Predicting the Recidivism Rate of Commuted Prisoners

1 Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
2 Industrial Engineering and Management, National Taipei University of Technology, Taipei 10632, Taiwan
3 Taoyuan Prison, Agency of Corrections, Ministry of Justice, Taoyuan 33056, Taiwan
4 Graduate School of Crime Prevention and Corrections, Central Police University, Taoyuan 33304, Taiwan
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(12), 1187; https://doi.org/10.3390/math7121187
Submission received: 15 November 2019 / Revised: 29 November 2019 / Accepted: 2 December 2019 / Published: 4 December 2019

Abstract:
Commutation is a judicial policy that is implemented in most countries. The recidivism rate of commuted prisoners directly affects people's perception of and trust in commutation. Hence, if the recidivism rate of a commuted prisoner could be accurately predicted before the person returns to society, the number of reoffences could be reduced, thereby enhancing trust in the process. It is therefore of considerable importance that the recidivism rates of commuted prisoners are accurately predicted. The dynamic adjusting novel global harmony search (DANGHS) algorithm, proposed in 2018, is an improved algorithm that combines dynamic parameter adjustment strategies with the novel global harmony search (NGHS). The DANGHS algorithm improves the searching ability of the NGHS algorithm by using dynamic adjustment strategies for the genetic mutation probability. In this paper, we combined the DANGHS algorithm and an artificial neural network (ANN) into a DANGHS-ANN forecasting system to predict the recidivism rate of commuted prisoners. To verify the prediction performance of the DANGHS-ANN algorithm, we compared the experimental results with those of five other forecasting systems. The results showed that the proposed DANGHS-ANN algorithm gave more accurate predictions. In addition, the use of the threshold linear posterior decreasing strategy with the DANGHS-ANN forecasting system resulted in more accurate predictions of recidivism. Finally, the metaheuristic algorithm performed better searches with the dynamic parameter adjustment strategy than without it.

1. Introduction

Parole is the temporary and conditional release of a prisoner prior to the completion of their maximum sentence period. Commutation is the substitution of a lesser penalty for that originally given at the time of conviction. However, whether on parole or commutation, if the prisoner reoffends it can cause social disruption. This highlights the need for accurate recidivism predictions for parolees and commutation offenders. Carroll et al. [1] stated that "in Pennsylvania, the parolees with alcohol problems, younger parolees, and those originally convicted of property crimes (rather than assaultive or drug crimes) were more likely to commit new crimes on parole. Offenders with past heroin use were convicted of more serious crimes on parole. Absconding was significantly more predictable for cases with prior convictions, previous parole violations, and miscellaneous negative statements by the institution about the inmate's personality." Williams et al. [2] demonstrated that in California, non-sex offenders, drug registrants, offenders with more than one felony conviction, frequently unemployed offenders, offenders with unstable living arrangements, offenders aged 25 to 30, previous parole violators, and unmarried offenders were more likely to abscond. MacKenzie and Spencer [3] showed that in northern Virginia, offenders committed more crimes when they engaged in high-risk behaviors such as drug use, alcohol abuse, and carrying a gun; conversely, they committed fewer crimes when they were employed or lived with spouses. In Benda's paper [4], the findings indicated that caregiving factors have an inverse relationship with the rate of recidivism, whereas low self-control, drug use and sales, gang membership, peer association with criminals, carrying weapons, and poor social skills have a positive relationship with recidivism rates. Trulson et al. [5] stated that "generally, males, those younger at first contact with the juvenile justice system, those with a greater number of felony adjudications, gang members, institutional dangers, those in poverty, and those with mental health issues were significantly more likely to recidivate." Previous studies have focused on qualitative research or statistical analysis. However, over the past two decades, many artificial intelligence methods, such as the artificial neural network (ANN) [6,7], support vector machine (SVM) [8], and association rule (AR) [9], have been developed and applied to many problems. Among these methods, the artificial neural network has been widely used to solve numerous types of forecasting problems and can achieve better accuracy than SVM in supervised learning [8,10]. Therefore, this paper adopted the ANN as its forecasting tool.
An ANN is a computational mechanism that is inspired by the human brain. A typical ANN structure, also known as a multilayer perceptron (MLP), contains a number of layers, each composed of several basic components. The first layer is called the input layer, the last layer is called the output layer, and the other layers are hidden layers [11]. There are two types of basic components in a typical neural-network structure, namely, neurons and the links between them, as shown in Figure 1. The neurons $n_i$ are the processing elements, and the links are the interconnections. Every link has a corresponding weight parameter $w_j$ or bias parameter $b_i$. When a neuron receives stimuli from other neurons via the links, it processes the information and produces an output. There are three kinds of neurons, and as per the layers, they are categorized as input, hidden, and output. Input neurons receive stimuli from outside the network. Hidden neurons receive stimuli from neurons in earlier layers and relay their output to neurons in later layers. Output neurons transfer the output outside the network [12].
In the ANN system, the weights are updated by a systematic algorithm during the learning process. The backpropagation (BP) learning algorithm is the most popular procedure for training an ANN [13]. Initially, the weights and biases are assigned randomly. BP uses gradient descent to search for the point(s) with minimum error on an error surface (the error as a function of the ANN weights and biases). In other words, the weights and biases are updated according to the error, which is calculated from the network output and the actual data. Once the neural network training is completed, we can predict or classify new data by carrying out the calculation with the received stimuli (the new input data), the weights, and the biases. However, gradient descent, i.e., the learning process of the backpropagation network (BPN), can easily become trapped in a local optimum.
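To make the forward calculation above concrete, the following minimal Python sketch (illustrative only, not the authors' C# implementation; the sigmoid activation and the layer sizes are assumptions) shows how a small MLP turns the received stimuli, weights, and biases into an output:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    """One forward pass of a single-hidden-layer MLP.

    x  : input vector (stimuli received by the input neurons)
    W1 : input-to-hidden weights, b1: hidden biases
    W2 : hidden-to-output weights, b2: output biases
    """
    hidden = sigmoid(W1 @ x + b1)       # hidden neurons process the weighted stimuli
    output = sigmoid(W2 @ hidden + b2)  # output neurons transfer the result externally
    return output

# Tiny example with 2 inputs, 3 hidden neurons, and 1 output (as in Figure 2)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
print(forward(np.array([0.2, 0.7]), W1, b1, W2, b2))
```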
To overcome the aforementioned disadvantage of BPN, combinations of different metaheuristic algorithms and ANNs (metaheuristic-ANN) have been presented in many studies. Kattan and Abdullah [14] trained ANNs for pattern-classification problems using the harmony search (HS) algorithm. Tavakoli et al. [11] used the HS-ANN, novel global harmony search ANN (NGHS-ANN), and intelligent global harmony search ANN (IGHS-ANN) for three well-known classification problems. Kumaran and Ravi [15] combined the ANN and the HS algorithm for forecasting long-term, sector-wise electrical energy use. Göçken et al. [16] integrated metaheuristic and ANN algorithms for improved stock price prediction. In these papers, the results showed that metaheuristic-ANN has a better forecasting ability than BPN. Moreover, over the past two decades, HS and its variants have been widely proposed, discussed, and applied in many studies. Therefore, we combined different HS algorithms with the ANN in this paper.
In a metaheuristic-ANN, each trial solution of the metaheuristic algorithm encodes a set of ANN weights and biases. The metaheuristic-ANN uses a random search mechanism to determine the weights and biases that minimize the forecasting error, such as the mean squared error (MSE) or the error rate. Figure 2 shows a small example illustrating the relationship between a metaheuristic algorithm and the ANN; it contains two input neurons, three hidden neurons, and one output neuron. Figure 3 shows the procedure of the metaheuristic-ANN. The MSE and error rate are given in Equations (1) and (2).
$$MSE = \frac{\sum_{i=1}^{Datas} \sum_{j=1}^{Outputs} \left(z_{i,j} - t_{i,j}\right)^2}{Datas} \quad (1)$$
$$Error\ Rate = \frac{No.\ of\ total\ data - No.\ of\ correct\ forecasts}{No.\ of\ total\ data} \quad (2)$$
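As an illustration of how a metaheuristic trial solution maps onto the ANN weights and biases and how Equations (1) and (2) are evaluated, a minimal Python sketch is given below (the flat-vector encoding, slice ordering, and example counts are hypothetical, not the authors' exact implementation):

```python
import numpy as np

def decode(solution, n_in=2, n_hidden=3, n_out=1):
    """Split a flat trial solution (harmony) into ANN weights and biases,
    matching the small network of Figure 2. The ordering of the slices is
    an illustrative assumption."""
    i = 0
    W1 = solution[i:i + n_hidden * n_in].reshape(n_hidden, n_in); i += n_hidden * n_in
    b1 = solution[i:i + n_hidden]; i += n_hidden
    W2 = solution[i:i + n_out * n_hidden].reshape(n_out, n_hidden); i += n_out * n_hidden
    b2 = solution[i:i + n_out]
    return W1, b1, W2, b2

def mse(z, t):
    """Equation (1): squared differences between outputs z and targets t,
    summed over all output neurons and averaged over the data."""
    z, t = np.asarray(z), np.asarray(t)
    return np.sum((z - t) ** 2) / z.shape[0]

def error_rate(n_total, n_correct):
    """Equation (2): fraction of incorrectly forecast samples."""
    return (n_total - n_correct) / n_total

# Demo: decode a 13-element trial solution and evaluate hypothetical counts
W1, b1, W2, b2 = decode(np.arange(13.0))
print(error_rate(6825, 4500))  # hypothetical counts, for illustration only
```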
The dynamic adjusting novel global harmony search (DANGHS) algorithm [17], proposed in 2018, is a metaheuristic algorithm that combines the novel global harmony search (NGHS) [18,19,20] with dynamic parameter adjustment strategies. In NGHS, the genetic mutation probability ($p_m$) is a fixed given value. However, the searching ability of a metaheuristic algorithm can be improved by appropriate parameters, the importance of which has been described in many studies [11,18,19,20,21,22]. Therefore, in DANGHS, the genetic mutation probability is dynamically adjusted in each iteration. Chiu et al. [17] found that the DANGHS algorithm is more efficient and effective than other HS algorithms. However, in their paper, they used the DANGHS algorithm to solve 14 benchmark continuous optimization problems only. In this paper, we investigate the searching performance of the DANGHS algorithm further. Therefore, a DANGHS-ANN recidivism forecasting system is proposed in this paper. According to the numerical results, the DANGHS-ANN system provided more accurate forecasts than the five other systems (BPN, HS-ANN, IHS-ANN, SGHS-ANN, and NGHS-ANN).
The remainder of this paper is divided into three sections. Section 2 introduces the harmony search (HS), improved harmony search (IHS), self-adaptive global best harmony search (SGHS), novel global harmony search (NGHS), and dynamic adjusting novel global harmony search (DANGHS) algorithms. Section 3 discusses the experiments carried out to test and compare the performances of the six forecasting systems. Conclusions and suggestions for future research are provided in Section 4.

2. A Review of Five Harmony Search Algorithms

In this section, HS, IHS, SGHS, NGHS, and DANGHS are reviewed.

2.1. Harmony Search Algorithm

Geem, Kim, and Loganathan [23] first proposed the HS algorithm in 2001. In concept, HS is similar to other metaheuristic algorithms such as genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). These algorithms combine the rules of randomness to imitate the processes that inspired them. The HS algorithm draws its inspiration from the improvisation process of musicians, such as a jazz trio, and not from biological or physical processes [11,14].
In musical improvisation, musicians play pitches within a set range, and then combine and order them to form a harmony. If the harmony is pleasant, it is stored in each musician's memory, thereby increasing the possibility of performing it again in the future [24]. Similarly, in engineering optimization, each decision variable initially selects a value within a given, feasible range, and then all the variables are combined together to give a single solution vector [20]. In the HS algorithm, the trial solution (harmony) for the problem is composed of several decision variable values (pitches). Thus, a pleasing harmony corresponds to a good trial solution to the problem [11]. If all the decision variable values compose a good trial solution, then the decision variable values are stored in each variable's memory. Therefore, the possibility of generating a good solution in the future is increased [20]. Figure 4 shows a comparison between music improvisation and engineering optimization. In Figure 4, the musicians of the jazz trio play their instruments simultaneously to form a harmony. The pitch of the piano represents the value of decision variable 1.
The HS algorithm consists of several parameters. These parameters are the harmony memory size (m), the harmony memory considering rate (HMCR), the pitch adjusting rate (PAR), the bandwidth (BW), and the maximum number of iterations (NI). Among these parameters, the HMCR, PAR, and BW are particularly important because the HS generates a new trial solution from harmony memory (HM) or random selection according to the HMCR, and then the HS adjusts the new trial solution using the PAR and BW. The whole search process of the HS algorithm can be described in the following steps.
  • Step 1: Determine the problem and initial algorithm parameters, including m, HMCR, PAR, BW, current iteration k = 1, and NI.
  • Step 2: Generate the initial solutions (harmony memory) randomly and calculate the fitness of each solution.
  • Step 3: Generate a trial solution by HMCR, PAR, and BW. The pseudocode of HS algorithm is shown in Algorithm 1.
Algorithm 1 The Pseudocode of HS
1:  For $j = 1$ to $D$ do
2:    If $r_1 \le HMCR$ then
3:      $x_j^{k+1} = x_{ij}^k$
4:      If $r_2 \le PAR$ then
5:        $x_j^{k+1} = x_j^{k+1} - BW + r_3 \times 2 \times BW$
6:        If $x_j^{k+1} > x_{jU}$ then
7:          $x_j^{k+1} = x_{jU}$
8:        Else if $x_j^{k+1} < x_{jL}$ then
9:          $x_j^{k+1} = x_{jL}$
10:       End
11:     End
12:   Else
13:     $x_j^{k+1} = x_{jL} + r_4 \times (x_{jU} - x_{jL})$
14:   End
15: End
Here, $D$ represents the number of problem dimensions; $r_1$, $r_2$, $r_3$, and $r_4$ represent random numbers in the interval [0, 1]. $x_{ij}^k$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, D$) represents the $j$th component of the $i$th solution in the current iteration $k$. $x_{jL}$ represents the lower bound of decision variable $x_j$, and $x_{jU}$ represents the upper bound.
  • Step 4: If the trial solution is better than the worst solution in the HM, replace the worst solution by the trial solution.
  • Step 5: If the maximum number of iterations NI is satisfied, return the best solution in the HM; otherwise, the current iteration k = k + 1 and go back to step 3.
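The following compact Python sketch puts Steps 1–5 and Algorithm 1 together for a generic minimization problem (a didactic sketch only, not the authors' implementation; the parameter defaults follow Table 4, and the sphere function is just an illustrative objective):

```python
import numpy as np

def harmony_search(f, lower, upper, m=5, hmcr=0.9, par=0.3, bw=0.01, ni=10000, seed=0):
    """Minimal harmony search for minimizing f over box bounds."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    d = lower.size
    hm = lower + rng.random((m, d)) * (upper - lower)   # Step 2: random harmony memory
    fit = np.array([f(x) for x in hm])
    for _ in range(ni):                                 # Steps 3-5
        new = np.empty(d)
        for j in range(d):
            if rng.random() <= hmcr:                    # memory consideration
                new[j] = hm[rng.integers(m), j]
                if rng.random() <= par:                 # pitch adjustment within +/- BW
                    new[j] = np.clip(new[j] - bw + rng.random() * 2 * bw,
                                     lower[j], upper[j])
            else:                                       # random selection
                new[j] = lower[j] + rng.random() * (upper[j] - lower[j])
        worst = np.argmax(fit)
        if f(new) < fit[worst]:                         # Step 4: replace worst if better
            hm[worst], fit[worst] = new, f(new)
    return hm[np.argmin(fit)]                           # Step 5: best solution in HM

# Example: minimize the sphere function in 5 dimensions
print(harmony_search(lambda x: np.sum(x ** 2), [-1] * 5, [1] * 5, ni=2000))
```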

2.2. Improved Harmony Search Algorithm

Mahdavi, Fesanghary, and Damangir [25] presented the IHS algorithm in 2007 for solving optimization problems. The main difference between IHS and the traditional HS method is that two key parameters, PAR and BW, are adjusted in each iteration, as shown in Equations (3) and (4). In their paper, they state that PAR and BW are important parameters in the search over decision variable values and can potentially speed up the convergence of the HS to the optimal solution. Therefore, the fine adjustment of these parameters is of particular interest.
$$PAR^k = PAR_{min} + \frac{PAR_{max} - PAR_{min}}{NI} \times k \quad (3)$$
$$BW^k = BW_{max} \times e^{\ln\left(BW_{min}/BW_{max}\right) \times k/NI} \quad (4)$$
In Equation (3), $PAR^k$ represents the pitch adjusting rate in the current iteration $k$; $PAR_{min}$ is the minimum adjusting rate, and $PAR_{max}$ is the maximum adjusting rate. In Equation (4), $BW^k$ is the distance bandwidth in the current iteration $k$; $BW_{min}$ is the minimum bandwidth, and $BW_{max}$ is the maximum bandwidth. Figure 5 shows how the PAR and BW values vary dynamically with the iteration number.
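A small Python sketch of the two schedules in Equations (3) and (4) is given below (the PAR bounds follow Table 4, while the BW bounds are illustrative assumptions, since $BW_{max}$ depends on the variable bounds):

```python
import math

def ihs_par(k, ni, par_min=0.01, par_max=0.99):
    """Equation (3): linearly increasing pitch adjusting rate."""
    return par_min + (par_max - par_min) / ni * k

def ihs_bw(k, ni, bw_min=1e-4, bw_max=0.05):
    """Equation (4): exponentially decreasing bandwidth."""
    return bw_max * math.exp(math.log(bw_min / bw_max) * k / ni)

# Values at the start, middle, and end of a 10,000-iteration run
for k in (0, 5000, 10000):
    print(k, round(ihs_par(k, 10000), 3), round(ihs_bw(k, 10000), 6))
```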

2.3. Self-Adaptive Global Best Harmony Search Algorithm

In 2010, Pan et al. [21] introduced the SGHS algorithm for continuous optimization problems. The main difference between SGHS and the traditional HS method is that HMCR and PAR are dynamically adjusted using normal distributions, and BW is altered in each iteration.
In each iteration $k$, SGHS generates the value of $HMCR^k$ from the mean HMCR ($HMCR_m$) and its standard deviation. Similarly, the value of $PAR^k$ is generated from the mean PAR ($PAR_m$) and its standard deviation. In their paper, $HMCR_m$ lies in the range [0.9, 1.0] with a standard deviation of 0.01, and $PAR_m$ lies in the range [0.0, 1.0] with a standard deviation of 0.05.
Furthermore, whenever the generated harmony is better than the worst harmony in the HM, SGHS records the corresponding $HMCR^k$ and $PAR^k$. After a specified learning period (LP), SGHS recalculates $HMCR_m$ by averaging all the $HMCR^k$ values recorded during the learning period. In the same way, $PAR_m$ is recalculated by averaging all the recorded $PAR^k$ values. In subsequent iterations, SGHS generates new $HMCR^k$ and $PAR^k$ values from the new $HMCR_m$, $PAR_m$, and the given standard deviations.
In addition, $BW^k$ decreases during the first half of the iterations according to Equation (5) and is then fixed at $BW_{min}$ during the second half of the iterations.
$$BW^k = \begin{cases} BW_{max} - \dfrac{BW_{max} - BW_{min}}{NI} \times 2k & \text{if } k < NI/2, \\ BW_{min} & \text{if } k \ge NI/2, \end{cases} \quad (5)$$
The whole search process of the SGHS algorithm can be described in the following steps.
  • Step 1: Determine the problem and initial algorithm parameters, including m, $HMCR_m$, $PAR_m$, $BW_{max}$, $BW_{min}$, LP, the current iteration k = 1, and NI.
  • Step 2: Generate the initial solutions (harmony memory) randomly and calculate the fitness of each solution.
  • Step 3: Generate the algorithm parameters in the current iteration $k$, including $HMCR^k$, $PAR^k$, and $BW^k$.
  • Step 4: Generate a trial solution by $HMCR^k$, $PAR^k$, and $BW^k$. The pseudocode of the SGHS algorithm is shown in Algorithm 2.
Algorithm 2 The Pseudocode of SGHS [21]
1:  For $j = 1$ to $D$ do
2:    If $r_1 \le HMCR^k$ then
3:      $x_j^{k+1} = x_{ij}^k - BW^k + r_2 \times 2 \times BW^k$
4:      If $x_j^{k+1} > x_{jU}$ then
5:        $x_j^{k+1} = x_{jU}$
6:      Else if $x_j^{k+1} < x_{jL}$ then
7:        $x_j^{k+1} = x_{jL}$
8:      End
9:      If $r_3 \le PAR^k$ then
10:       $x_j^{k+1} = x_{best,j}^k$
11:     End
12:   Else
13:     $x_j^{k+1} = x_{jL} + r_4 \times (x_{jU} - x_{jL})$
14:   End
15: End
Here, $x_{best,j}^k$ represents the $j$th component of the best solution in the current iteration $k$.
  • Step 5: If the trial solution is better than the worst solution in the HM, replace the worst solution with the trial solution and record the values of HMCR and PAR in the current iteration $k$.
  • Step 6: Recalculate $HMCR_m$ and $PAR_m$.
  • Step 7: If the maximum number of iterations NI is satisfied, return the best solution in the HM; otherwise, set the current iteration k = k + 1 and go back to Step 3.
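The parameter self-adaptation described above can be sketched in Python as follows (a simplified illustration of Steps 3, 5, and 6 and Equation (5); the class structure is an assumption, while the means, standard deviations, and LP follow Section 2.3 and Table 4):

```python
import numpy as np

class SGHSParameters:
    """Sketch of the SGHS parameter self-adaptation."""

    def __init__(self, hmcr_m=0.98, par_m=0.9, lp=100, seed=0):
        self.hmcr_m, self.par_m, self.lp = hmcr_m, par_m, lp
        self.recorded_hmcr, self.recorded_par = [], []
        self.rng = np.random.default_rng(seed)

    def sample(self):
        """Step 3: draw HMCR_k ~ N(HMCR_m, 0.01) and PAR_k ~ N(PAR_m, 0.05)."""
        return self.rng.normal(self.hmcr_m, 0.01), self.rng.normal(self.par_m, 0.05)

    def record_success(self, hmcr_k, par_k):
        """Step 5: remember the parameters that produced an improving harmony."""
        self.recorded_hmcr.append(hmcr_k)
        self.recorded_par.append(par_k)

    def end_of_iteration(self, k):
        """Step 6: after every learning period, reset the means to the recorded averages."""
        if k % self.lp == 0 and self.recorded_hmcr:
            self.hmcr_m = float(np.mean(self.recorded_hmcr))
            self.par_m = float(np.mean(self.recorded_par))
            self.recorded_hmcr, self.recorded_par = [], []

def sghs_bw(k, ni, bw_min=1e-4, bw_max=0.05):
    """Equation (5): BW decreases linearly in the first half, then stays at BW_min."""
    if k < ni / 2:
        return bw_max - (bw_max - bw_min) / ni * 2 * k
    return bw_min

# Usage: sample the iteration-specific parameters
params = SGHSParameters()
hmcr_k, par_k = params.sample()
print(round(hmcr_k, 3), round(par_k, 3), sghs_bw(10, 10000))
```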

2.4. Novel Global Harmony Search Algorithm

Zou et al. proposed the NGHS algorithm in 2010 for task assignment problems [18], reliability problems [19], and unconstrained problems [20]. The NGHS algorithm is an improved algorithm that combines HS, PSO [26,27,28,29], and GA [30,31,32]. A prominent characteristic of PSO is that individual particles attempt to imitate the social experience; that is, the particles are affected by other, better particles in the PSO algorithm. A prominent characteristic of GA is that the trial solution can escape from a local optimum through mutation. In other words, NGHS tries to generate a new trial solution by moving the worst solution toward the best solution or by mutation. In addition, HMCR, PAR, and BW are excluded from NGHS, while the genetic mutation probability ($p_m$) is included. Moreover, NGHS replaces the worst solution in the HM with a new solution, even if the new solution is worse than the worst solution. These three characteristics are the key differences between the HS and NGHS algorithms. The whole search process of the NGHS algorithm can be described in the following steps.
  • Step 1: Determine the problem and initial algorithm parameters, including m, $p_m$, the current iteration k = 1, and NI.
  • Step 2: Generate the initial solutions (harmony memory) randomly and calculate the fitness of each solution.
  • Step 3: Generate a trial solution by $p_m$. The pseudocode of the NGHS algorithm is shown in Algorithm 3.
Algorithm 3 The Pseudocode of NGHS [18,19,20]
1:  For $j = 1$ to $D$ do
2:    $x_R = 2 \times x_{best,j}^k - x_{worst,j}^k$
3:    If $x_R > x_{jU}$ then
4:      $x_R = x_{jU}$
5:    Else if $x_R < x_{jL}$ then
6:      $x_R = x_{jL}$
7:    End
8:    $x_j^{k+1} = x_{worst,j}^k + r_1 \times (x_R - x_{worst,j}^k)$
9:    If $r_2 \le p_m$ then
10:     $x_j^{k+1} = x_{jL} + r_3 \times (x_{jU} - x_{jL})$
11:   End
12: End
Here, $x_R$ represents the trust region, and $x_{worst,j}^k$ represents the $j$th component of the worst solution in the current iteration $k$.
  • Step 4: Replace the worst solution by the trial solution, even if the trial solution is worse than the worst solution.
  • Step 5: If the maximum number of iterations NI is satisfied, return the best solution in the HM; otherwise, the current iteration k = k + 1 and go back to Step 3.
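A minimal Python sketch of one NGHS improvisation (Algorithm 3) is shown below; $p_m$ = 0.05 follows Table 4, and the helper names and demo data are illustrative only:

```python
import numpy as np

def nghs_improvise(hm, fit, lower, upper, pm=0.05, rng=None):
    """One NGHS improvisation: move the worst harmony toward the best one,
    then apply genetic mutation with probability pm (for minimization)."""
    rng = rng if rng is not None else np.random.default_rng()
    best = hm[np.argmin(fit)]
    worst = hm[np.argmax(fit)]
    new = np.empty_like(best)
    for j in range(len(new)):
        x_r = np.clip(2.0 * best[j] - worst[j], lower[j], upper[j])  # trust region
        new[j] = worst[j] + rng.random() * (x_r - worst[j])          # position update
        if rng.random() <= pm:                                       # genetic mutation
            new[j] = lower[j] + rng.random() * (upper[j] - lower[j])
    return new

# Demo with a random memory of 5 harmonies in 3 dimensions
rng = np.random.default_rng(0)
hm = rng.random((5, 3))
fit = np.array([np.sum(x ** 2) for x in hm])
lower, upper = np.zeros(3), np.ones(3)
new = nghs_improvise(hm, fit, lower, upper, rng=rng)
# Step 4: the new harmony always replaces the worst one, even if it is worse
hm[np.argmax(fit)] = new
print(new)
```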

2.5. Dynamic Adjusting Novel Global Harmony Search

Chiu et al. [17] first proposed the DANGHS algorithm for continuous optimization problems. As mentioned above, the main difference between DANGHS and NGHS is that the genetic mutation probability ($p_m$) is dynamically adjusted in each iteration by an adjustment strategy. Chiu et al. pointed out that the mutation probability can be adjusted using different strategies; hence, 16 different strategies were investigated in their paper. All 16 strategies are shown in Table 1, and Figure 6, Figure 7 and Figure 8 illustrate them. In Table 1, $p_m^k$ is the mutation probability in the current iteration $k$, $p_{m\_min}$ is the minimum genetic mutation probability, $p_{m\_max}$ is the maximum genetic mutation probability, $mr$ is the modification rate, and $cc$ is the coefficient of cycle.
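To illustrate how a dynamic adjustment strategy replaces the fixed $p_m$ of NGHS, the sketch below implements two of the strategies from Table 1 in Python (the $p_m$ bounds follow Table 4; the exact form of the second branch of Threshold_4 is our reading of the source equation, i.e., a linear decrease from $p_{m\_max}$ at k = NI/2 to $p_{m\_min}$ at k = NI):

```python
import math

def pm_threshold_posterior_decreasing(k, ni, pm_min=0.001, pm_max=0.100):
    """Threshold linear posterior decreasing strategy (Threshold_4 in Table 1):
    keep pm at pm_max for the first half of the run, then decrease it linearly
    to pm_min over the second half."""
    if k < ni / 2:
        return pm_max
    return pm_max + (pm_min - pm_max) / ni * (2 * k - ni)

def pm_natural_exponential_decreasing(k, ni, pm_min=0.001, pm_max=0.100):
    """Natural exponential decreasing strategy (Exponential_2 in Table 1)."""
    return pm_max * math.exp(math.log(pm_min / pm_max) * k / ni)

# DANGHS then simply calls the NGHS improvisation sketched earlier with
# pm = pm_threshold_posterior_decreasing(k, ni) at every iteration k.
for k in (0, 5000, 7500, 10000):
    print(k,
          round(pm_threshold_posterior_decreasing(k, 10000), 4),
          round(pm_natural_exponential_decreasing(k, 10000), 4))
```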

3. Experiments and Analysis

3.1. Data Setting

The investigation samples were provided by the Information Department of the Taiwan Ministry of Justice. The samples are criminal tracing records established over three years, from July 16, 2007 to July 15, 2010. The data are used solely for academic research on predicting recidivism. To ensure personal privacy, the samples were preprocessed (deidentified).
The total number of samples collected for this paper was 9498. Of the samples, 8569 were male and 929 were female. For the purposes of this paper, the definition of recidivism was “having a record of prosecution.” Of the samples collected, 5408 (56.94%) were recidivists and 4090 (43.06%) were non-recidivists. The input and output variables that were used are shown in Table 2.
Of the original 9498 samples, some were found to have the same combination of input variables, but had different output variables, as shown in Figure 9a. Such samples make it impossible to accurately train the ANN. Therefore, in this paper, we used a statistical method to recalculate the output variable as the recidivism rate, as shown in Figure 9b. After recalculation, the total number of samples was 6825. A recidivism rate above or equal to 0.5 was defined as high, and a rate below 0.5 was defined as low, as shown in the last column of Table 2.
Lastly, proper data representation plays an important role in the design of a successful ANN [33]. Therefore, the output variable was encoded with two binary output neurons representing the rate of recidivism, where 10 represented a low rate and 01 a high rate, and all data were standardized to the range between 0.1 and 0.9. Hence, there are 12 and 2 neurons in the input and output layers, respectively.
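The preprocessing and encoding described above can be sketched as follows (a hypothetical helper, not the authors' code; the demo samples are invented purely for illustration):

```python
from collections import defaultdict

def recidivism_rates(samples):
    """Collapse samples that share the same input combination into a single
    record whose output is the observed recidivism rate (Figure 9), then label
    it high (rate >= 0.5) or low (rate < 0.5). `samples` is a list of
    (input_tuple, recidivated) pairs."""
    counts = defaultdict(lambda: [0, 0])            # input_tuple -> [recidivists, total]
    for inputs, recidivated in samples:
        counts[inputs][0] += int(recidivated)
        counts[inputs][1] += 1
    records = []
    for inputs, (r, n) in counts.items():
        rate = r / n
        target = (0, 1) if rate >= 0.5 else (1, 0)  # 01 = high rate, 10 = low rate
        records.append((inputs, rate, target))
    return records

# Example with a duplicated input combination
demo = [(("male", "theft"), 1), (("male", "theft"), 0), (("male", "theft"), 1),
        (("female", "fraud"), 0)]
for rec in recidivism_rates(demo):
    print(rec)
```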

3.2. Computing Environment Settings

The program was written in C# with Microsoft Visual Studio 2010 (64-bit). The experiments were run on equipment comprising an Intel Core(TM) i7-4720HQ (2.6 GHz) CPU, 8 GB of memory, and the Windows 10 Home Edition (64-bit) operating system.

3.3. Experimental Structure and Results of BPN

In the BPN experiments, 65% of the samples were used for training, 25% for validation, and 10% for testing [34]. However, the performance of the forecasting system could be affected by the quality of the training samples. To mitigate this, previous studies have used K-fold cross validation. In this paper, we also used K-fold cross validation (K = 10) for repeated experiments to verify the robustness of the experimental results under different training samples. In each K-fold experiment, 30 independent experiments were carried out. The total number of training iterations (epochs) was 10,000. One hidden layer was used, and the number of hidden neurons (NHN) was set to five according to Equation (6) [33]. In Equation (6), $N_{in}$ represents the number of input neurons, and $N_{out}$ represents the number of output neurons. Therefore, the total number of weights and biases is 77, and the example-to-weight ratio (EWR) is 88.64, as calculated in Equation (7). Dowla and Rogers [35] and Haykin [36] found that the EWR needs to be larger than 10. Hence, the ANN used in this paper has a reasonable and acceptable structure. Figure 10 shows the MLP structure used in this paper.
$$NHN = \sqrt{N_{in} \times N_{out}} \quad (6)$$
$$EWR = \frac{No.\ of\ total\ data}{No.\ of\ total\ weights\ and\ biases} \quad (7)$$
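The arithmetic behind Equations (6) and (7) can be checked in a few lines (reading Equation (6) as the geometric-mean rule of Masters [33], which reproduces the five hidden neurons and the EWR of 88.64 reported above):

```python
import math

n_in, n_out = 12, 2                      # input and output neurons (Section 3.1)
nhn = round(math.sqrt(n_in * n_out))     # Equation (6): sqrt(12 * 2) ~= 4.9 -> 5 hidden neurons
weights_biases = n_in * nhn + nhn + nhn * n_out + n_out   # 60 + 5 + 10 + 2 = 77
ewr = 6825 / weights_biases              # Equation (7): 6825 samples / 77 parameters
print(nhn, weights_biases, round(ewr, 2))  # 5 77 88.64
```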
For the purposes of this study, the investigation is a classification problem in which the forecasting output is either a high or a low recidivism rate. Therefore, the prediction error used is the error rate, as given in Equation (2).
Lastly, we used the trial and error method to decide the best learning rate (η) and momentum (α) of the BPN. We considered three learning rates: 0.1, 0.2, and 0.3. Zupan and Gasteiger suggest that the sum of the learning rate and the momentum should equal 1 [37]. Therefore, the three corresponding momentum values are 0.9, 0.8, and 0.7. The trial and error experimental results are shown in Table 3. In Table 3, LR represents the learning rate, MT represents the momentum, Std represents the standard deviation, and each value represents the mean error rate of 30 independent experiments. For example, 3.37 × 10−1 represents the mean error rate of 30 independent experiments where LR is set to 0.1, MT is set to 0.9, and K is set to 1. Based on the means (3.25 × 10−1, 3.53 × 10−1, and 3.52 × 10−1) and the p-values, we can observe that the error rate where the learning rate is set to 0.1 and the momentum is set to 0.9 is significantly smaller than those of the other two parameter combinations in the training, validation, and testing datasets.

3.4. Experimental Results of the Metaheuristic-ANN Forecasting System

We combined the DANGHS algorithm with the ANN (DANGHS-ANN) to solve the recidivism rate prediction problem. In order to verify the performance of the DANGHS-ANN forecasting system, we compared the extensive experimental results of DANGHS-ANN with those of five other systems, including various HS-ANNs and one BPN. We referred to previous studies [17,20,21,22] and used the trial and error method to decide the parameters of the different HS algorithms. The parameters of the compared HS algorithms are shown in Table 4. For each algorithm and each K-fold experiment, thirty independent experiments (n) were carried out with 10,000 iterations each. No overtraining was detected in the metaheuristic-ANN forecasting systems. Therefore, we combined the training and validation datasets into the learning dataset; this means that 90% of the samples were used for learning and 10% for testing. The experimental results from the learning and testing datasets, obtained using the 16 different adjustment strategies in the DANGHS-ANN forecasting system, are shown in Table 5 and Table 6. The experimental results from the learning and testing datasets, obtained using the six different forecasting systems, are shown in Table 7 and Table 8. Figure 11 presents a typical solution history graph of the six different forecasting systems over the iterations. Table 9, Table 10, Table 11, Table 12, Table 13 and Table 14 present the comparison between actual and predicted recidivism for the six forecasting systems.
In Table 5, several experimental results are given. Firstly, in the learning dataset where K = 1, the error rate (3.1190 × 10−1) with the threshold linear posterior decreasing strategy (Threshold_4) is lower than those of the other strategies. Secondly, the mean error rate (3.1132 × 10−1) with Threshold_4 is the lowest of all strategies. Thirdly, the standard deviation (1.2896 × 10−6) with Threshold_4 is lower than those of the other strategies. Lastly, according to the p-value, the error rate for Threshold_4 is significantly lower than those of the other strategies, except for the straight linear decreasing strategy (Straight_2).
In Table 6, several experimental results are given. Firstly, in the testing dataset where K = 1, the error rate (3.3934 × 10−1) with the threshold linear posterior decreasing strategy (Threshold_4) is lower than those of the other strategies. Secondly, the mean of error rate (3.3878 × 10−1) with Threshold_4 is the lowest of all the strategies. Thirdly, the standard deviation (3.2137 × 10−5) with the threshold linear prior increasing strategy (Threshold_1) is lower than those of the other strategies. However, according to the p-value, the error rate with Threshold_4 is significantly lower than those of the other strategies, except for strategy Straight_2. Based on the experimental results in Table 5 and Table 6, the best strategy for the DANGHS-ANN forecasting system, and specifically for the recidivism prediction problem, is the threshold linear posterior decreasing strategy.
In Table 7, several experimental results are given. Firstly, in the learning dataset, the DANGHS-ANN error rates are the lowest of all the other forecasting systems, for all K-fold experiments. Additionally, the DANGHS-ANN mean of error rate (3.1132 × 10−1) and standard deviation (1.1356 × 10−3) are the lowest of all the forecasting systems. This means that the forecasting ability of the DANGHS-ANN system is relatively robust. Also, according to the p-value, the DANGHS-ANN error rate is significantly lower than the other systems in the learning dataset. The mean of error rate of the IHS-ANN system (3.1315 × 10−1) is lower than that of the HS-ANN system (3.1357 × 10−1), and the mean of error rate of the DANGHS-ANN system (3.1132 × 10−1) is lower than that of the NGHS-ANN system (3.1545 × 10−1).
In Table 8, several experimental results are given. Firstly, in the testing dataset, most of the DANGHS-ANN error rates are smaller than those of the other forecasting systems, except where K = 2 and K = 7. Where K = 2 and K = 7, the IHS-ANN system has the lowest error rate (3.3982 × 10−1 and 3.4056 × 10−1). Secondly, the DANGHS-ANN mean of error rate (3.3878 × 10−1) is lower than those of the other forecasting systems. Thirdly, the SGHS-ANN standard deviation (5.7819 × 10−3) is the lowest of all the forecasting systems. This means that the forecasting ability of the SGHS-ANN system is relatively robust. However, according to the p-value, the error rate of the DANGHS-ANN system is significantly lower than the other systems in the testing dataset. Lastly, the IHS-ANN mean of error rate (3.4039 × 10−1) is lower than that of the HS-ANN system (3.4126 × 10−1), and the DANGHS-ANN mean of error rate (3.3878 × 10−1) is lower than that of the NGHS-ANN system (3.4421 × 10−1).
In Figure 11, several observations can be made. Firstly, the BPN system graph displays a horizontal line; this means that it easily falls into a local optimum and is unlikely to escape from it. The SGHS-ANN and NGHS-ANN graphs show that these systems also easily fall into local optima in the later iterations. In contrast, the HS-ANN, IHS-ANN, and DANGHS-ANN graphs show that these three systems continuously decrease the error rate as the iterations progress. Moreover, comparing the NGHS-ANN and DANGHS-ANN graphs, we find that the DANGHS-ANN error rate is lower than that of the NGHS-ANN system. In other words, the DANGHS-ANN system has a better forecasting ability than the NGHS-ANN system.
Finally, we analyze and discuss the comparison between actual and predictive recidivism. In Table 9, the number of predictions of low recidivism by the BPN system is 2652, of which the actual number of high recidivisms is 886, and the rate of high recidivism is 33.41%. In Table 10, the number of predictions of low recidivism by the HS-ANN system is 2596, of which the actual number of high recidivisms is 841, and the rate of high recidivism is 32.40%. In Table 11, the number of predictions of low recidivism by the IHS-ANN system is 2397, of which the actual number of high recidivisms is 767, and the rate of high recidivism is 32.00%. In Table 12, the number of predictions of low recidivism by the SGHS-ANN system is 2807, of which the actual number of high recidivisms is 931, and the rate of high recidivism is 33.17%. In Table 13, the number of predictions of low recidivism by the NGHS-ANN system is 2401, of which the actual number of high recidivisms is 792, and the rate of high recidivism is 32.99%. Lastly, in Table 14, the number of predictions of low recidivism by the DANGHS-ANN system is 2508, of which the actual number of high recidivisms is 798, and the rate of high recidivism is 31.82%.
Particular attention is paid to the decisions made in managing the prisoners. For example, the management decision for prisoners predicted to fall within the high recidivism category is to "continue serving their sentences." This kind of decision therefore positively affects the safety of the public. Conversely, prisoners who are predicted to fall within the low recidivism category will re-enter society through commutation or parole, and their behavior thereafter could directly affect the safety of the public. The purpose of this paper is to develop a recidivism rate forecasting system to reduce the associated reoffences and improve public acceptance of commutation and parole policies. According to the above argument, the type I error of the traditional presumption of innocence principle is not applicable to this paper. As shown in Table 9, Table 10, Table 11, Table 12, Table 13 and Table 14, with the BPN forecasting system the prisoners who were predicted to fall into the low recidivism category had the highest actual recidivism rate (33.41%) amongst all forecasting systems, whereas with the DANGHS-ANN forecasting system, the prisoners who were predicted to fall into the low recidivism category had the lowest actual recidivism rate (31.82%).
According to the above experimental results, the DANGHS-ANN system was the most accurate in comparison with BPN and the four other HS-related ANN forecasting systems used for the prediction of recidivism rates of commuted prisoners.

4. Conclusions and Future Research

We presented a combined DANGHS-ANN forecasting system. Extensive experiments and comparisons were carried out to solve the problem of accurately predicting recidivism rates for commuted prisoners. The experimental results provide several findings that are worth noting.
Firstly, using the threshold linear posterior decreasing strategy with the DANGHS-ANN forecasting system yielded the best results for recidivism prediction. Additionally, according to the p-value and convergence graph, DANGHS-ANN outperformed the five other forecasting systems (BPN, HS-ANN, IHS-ANN, SGHS-ANN, and NGHS-ANN).
Secondly, it was found that using a metaheuristic-ANN with a dynamic parameter adjustment strategy, such as DANGHS-ANN or IHS-ANN, could result in improved forecasting performance compared with systems without a dynamic parameter adjustment strategy, such as NGHS-ANN and HS-ANN. In other words, using a parameter adjustment strategy improves the prediction ability of the NGHS-ANN forecasting system. This conclusion is aligned with the study by Chiu et al. [17].
Finally, by comparing and analyzing actual versus predicted recidivism, it is shown that the DANGHS-ANN model proposed in this paper could reduce the number of prisoners with a high recidivism rate being granted commutation or parole. In this way, the proposed forecasting system could reduce the occurrence of recidivism and improve the public's opinion of the commutation and parole policy. In conclusion, DANGHS-ANN is a more accurate and acceptable forecasting system.
It is worth noting that many factors have not been considered in this paper and could be explored further in the future. First, we investigated the performance of the proposed DANGHS-ANN forecasting system with a fixed MLP structure. In the future, we could investigate the robustness of DANGHS-ANN with different MLP structures or with noisy input data sets. Second, in this paper, we used BPN and several HS-ANNs to forecast the recidivism rate. However, there are many other metaheuristic algorithms and forecasting methods, such as differential evolution (DE), PSO, AR, SVM, etc. In the future, we could use different artificial intelligence methods to solve this problem. Third, we investigated the performance of the dynamic parameter adjustment mechanism in the DANGHS algorithm only. In the future, we could investigate the performance of combining different dynamic parameter adjustment mechanisms with other metaheuristic algorithms, such as teaching learning based optimization (TLBO) [38], the modified coyote optimization algorithm (MCOA) [39], Harris hawks optimization (HHO) [40], etc.

Author Contributions

Conceptualization, P.-C.S. and C.-Y.C.; methodology, P.-C.S.; software, P.-C.S.; validation, P.-C.S.; formal analysis, P.-C.S.; resources, C.-H.C.; data curation, C.-H.C.; writing—original draft preparation, P.-C.S.; writing—review and editing, P.-C.S.; visualization, P.-C.S.; supervision, C.-Y.C.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank all the reviewers for their constructive comments and Editage (www.editage.cn) for English language editing.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

ACO: ant colony optimization
ANN: artificial neural network
AR: association rule
BP: backpropagation
BPN: backpropagation network
BW: bandwidth
$cc$: coefficient of cycle
DANGHS: dynamic adjusting novel global harmony search
DE: differential evolution
EWR: example-to-weight ratio
GA: genetic algorithm
HHO: Harris hawks optimization
HM: harmony memory
HMCR: harmony memory considering rate
HS: harmony search
IGHS: intelligent global harmony search
IHS: improved harmony search
$k$: current iteration
K: K-fold cross validation
LP: learning period
m: harmony memory size
MCOA: modified coyote optimization algorithm
MLP: multilayer perceptron
MSE: mean squared error
n: total number of experiments
NGHS: novel global harmony search
NHN: number of hidden neurons
NI: maximum number of iterations
$N_{in}$: number of input neurons
$N_{out}$: number of output neurons
PAR: pitch adjusting rate
$p_m$: genetic mutation probability
PSO: particle swarm optimization
SGHS: self-adaptive global best harmony search
SVM: support vector machine
TLBO: teaching learning based optimization

References

1. Carroll, J.S.; Wiener, R.L.; Coates, D.; Galegher, J.; Alibrio, J.J. Evaluation, Diagnosis and Prediction in Parole Decision Making. Law Soc. Rev. 1982, 17, 199–228.
2. Williams, F.P.; McShane, M.D.; Dolny, H.M. Predicting Parole Absconders. Prison J. 2000, 80, 24–38.
3. MacKenzie, D.L.; Spencer, D.L. The Impact of Formal & Social Controls on the Criminal Activities of Probationers. J. Res. Crime Delinq. 2002, 39, 243–276.
4. Benda, B.B. Survival Analysis of Criminal Recidivism of Boot Camp Graduates Using Elements From General & Developmental Explanatory Models. Int. J. Offender Ther. 2003, 47, 89–110.
5. Trulson, C.R.; Marquart, J.W.; Mullings, J.L.; Caeti, T.J. In Between Adolescence and Adulthood: Recidivism Outcomes of a Cohort of State Delinquents. Youth Violence Juv. Justice 2005, 3, 355–387.
6. Hwang, J.-I.; Jung, H.-S. Automatic Ship Detection Using the Artificial Neural Network and Support Vector Machine from X-Band Sar Satellite Images. Remote Sens. 2018, 10, 1799.
7. Gemitzi, A.; Lakshmi, V. Estimating Groundwater Abstractions at the Aquifer Scale Using GRACE Observations. Geosci. J. 2018, 8, 419.
8. Liu, M.-L.; Tien, F.-C. Reclaim wafer defect classification using SVM. In Proceedings of the Asia Pacific Industrial Engineering & Management Systems Conference, Taipei, Taiwan, 7–10 December 2016.
9. Moodley, R.; Chiclana, F.; Caraffini, F.; Carter, J. A product-centric data mining algorithm for targeted promotions. J. Retail. Consum. Serv. 2019, 3, 101940.
10. Tien, F.-C.; Sun, T.-H.; Liu, M.-L. Reclaim Wafer Defect Classification Using Backpropagation Neural Networks. In Proceedings of the International Congress on Recent Development in Engineering and Technology, Kuala Lumpur, Malaysia, 22–24 August 2016.
11. Tavakoli, S.; Valian, E.; Mohanna, S. Feedforward neural network training using intelligent global harmony search. Evol. Syst. 2012, 3, 125–131.
12. Zhang, Q.-J.; Gupta, K.C.; Devabhaktuni, V.K. Artificial Neural Networks for RF and Microwave Design—From Theory to Practice. IEEE Trans. Microw. Theory Tech. 2003, 51, 1339–1350.
13. Basheer, I.A.; Hajmeer, M. Artificial neural networks: Fundamentals, computing, design, and application. J. Microbiol. Methods 2000, 43, 3–31.
14. Kattan, A.; Abdullah, R. Training of feed-forward neural networks for pattern-classification applications using music inspired algorithm. Int. J. Comput. Sci. Inf. Secur. 2011, 9, 44–57.
15. Kumaran, J.; Ravi, G. Long-term Sector-wise Electrical Energy Forecasting Using Artificial Neural Network and Biogeography-based Optimization. Electr. Power Compon. Syst. 2015, 43, 1225–1235.
16. Göçken, M.; Özçalıcı, M.; Boru, A.; Dosdogru, A.T. Integrating metaheuristics and Artificial Neural Networks for improved stock price prediction. Expert Syst. Appl. 2016, 44, 320–331.
17. Chiu, C.-Y.; Shih, P.-C.; Li, X. A Dynamic Adjusting Novel Global Harmony Search for Continuous Optimization Problems. Symmetry 2018, 10, 337.
18. Zou, D.; Gao, L.; Li, S.; Wu, J.; Wang, X. A novel global harmony search algorithm for task assignment problem. J. Syst. Softw. 2010, 83, 1678–1688.
19. Zou, D.; Gao, L.; Wu, J.; Li, S.; Li, Y. A novel global harmony search algorithm for reliability problems. Comput. Ind. Eng. 2010, 58, 307–316.
20. Zou, D.; Gao, L.; Wu, J.; Li, S. Novel global harmony search algorithm for unconstrained problems. Neurocomputing 2010, 73, 3308–3318.
21. Pan, Q.K.; Suganthan, P.N.; Tasgetiren, M.F.; Liang, J.J. A self-adaptive global best harmony search algorithm for continuous optimization problems. Appl. Math. Comput. 2010, 216, 830–848.
22. Valian, E.; Tavakoli, S.; Mohanna, S. An intelligent global harmony search approach to continuous optimization problems. Appl. Math. Comput. 2014, 232, 670–684.
23. Geem, Z.W.; Kim, J.H.; Loganathan, G.V. A new heuristic optimization algorithm: Harmony search. Simulation 2001, 76, 60–68.
24. Omran, M.G.H.; Mahdavi, M. Global-best harmony search. Appl. Math. Comput. 2008, 198, 643–656.
25. Mahdavi, M.; Fesanghary, M.; Damangir, E. An improved harmony search algorithm for solving optimization problems. Appl. Math. Comput. 2007, 188, 1567–1579.
26. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995.
27. Eberhart, R.; Kennedy, J. A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; pp. 39–43.
28. Chen, K.H.; Chen, L.F.; Su, C.T. A new particle swarm feature selection method for classification. J. Intell. Inf. Syst. 2014, 42, 507–530.
29. Petrović, M.; Vuković, N.; Mitić, M.; Miljković, Z. Integration of process planning and scheduling using chaotic particle swarm optimization algorithm. Expert Syst. Appl. 2016, 64, 569–588.
30. Khaled, N.; Hemayed, E.E.; Fayek, M.B. A GA-based approach for epipolar geometry estimation. Int. J. Pattern Recognit. Artif. Intell. 2013, 27, 1355014.
31. Metawaa, N.; Hassana, M.K.; Elhoseny, M. Genetic algorithm based model for optimizing bank lending decisions. Expert Syst. Appl. 2017, 80, 75–82.
32. Song, S. Design of distributed database systems: An iterative genetic algorithm. J. Intell. Inf. Syst. 2015, 45, 29–59.
33. Masters, T. Practical Neural Network Recipes in C++; Academic Press: Boston, MA, USA, 1994; ISBN 9780124790407.
34. Looney, C.G. Advances in feedforward neural networks: Demystifying knowledge acquiring black boxes. IEEE Trans. Knowl. Data Eng. 1996, 8, 211–226.
35. Dowla, F.U.; Rogers, L.L. Solving Problems in Environmental Engineering and Geosciences with Artificial Neural Networks; MIT Press: Cambridge, MA, USA, 1995; ISBN 0262515726.
36. Haykin, S. Neural Networks: A Comprehensive Foundation; Macmillan: New York, NY, USA, 1994; ISBN 0023527617.
37. Zupan, J.; Gasteiger, J. Neural Networks for Chemists: An Introduction; VCH: New York, NY, USA, 1993; ISBN 3527286039.
38. Rao, R.V.; Savsani, V.J.; Vakharia, D.P. Teaching–learning-based optimization: A novel method for constrained mechanical design optimization problems. Comput. Aided Des. 2011, 43, 303–315.
39. Li, Z.; Cao, Y.; Dai, L.V.; Yang, X.; Nguyen, T.T. Optimal Power Flow for Transmission Power Networks Using a Novel Metaheuristic Algorithm. Energies 2019, 12, 4310.
40. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872.
Figure 1. Typical neural-network structure.
Figure 2. Relationship between metaheuristic algorithm and artificial neural network (ANN).
Figure 3. Procedure of Metaheuristic-ANN.
Figure 4. Comparison between music improvisation and engineering optimization.
Figure 5. (a) Linear variation of pitch adjusting rate (PAR) with the iteration number; (b) nonlinear variation of bandwidth (BW) with the iteration number.
Figure 6. Straight linear and threshold linear strategies.
Figure 7. Exponential strategies.
Figure 8. Concave cosine and convex cosine strategies.
Figure 9. Example of the original data and the statistical data. (a) Original data; (b) statistical data.
Figure 10. Multilayer perceptron structure.
Figure 11. Typical convergence graph of six different forecasting systems. (a) K = 1; (b) K = 2; (c) K = 3; (d) K = 4; (e) K = 5; (f) K = 6; (g) K = 7; (h) K = 8; (i) K = 9; (j) K = 10.
Table 1. Sixteen dynamic adjustment strategies.

Strategy | Description
1. Straight_1 | Straight linear increasing strategy: $p_m^k = p_{m\_min} + \frac{p_{m\_max} - p_{m\_min}}{NI} \times k$
2. Straight_2 | Straight linear decreasing strategy: $p_m^k = p_{m\_max} + \frac{p_{m\_min} - p_{m\_max}}{NI} \times k$
3. Threshold_1 | Threshold linear prior increasing strategy: $p_m^k = \begin{cases} p_{m\_min} + \frac{p_{m\_max} - p_{m\_min}}{NI} \times 2k & \text{if } k < NI/2 \\ p_{m\_max} & \text{if } k \ge NI/2 \end{cases}$
4. Threshold_2 | Threshold linear prior decreasing strategy: $p_m^k = \begin{cases} p_{m\_max} + \frac{p_{m\_min} - p_{m\_max}}{NI} \times 2k & \text{if } k < NI/2 \\ p_{m\_min} & \text{if } k \ge NI/2 \end{cases}$
5. Threshold_3 | Threshold linear posterior increasing strategy: $p_m^k = \begin{cases} p_{m\_min} & \text{if } k < NI/2 \\ p_{m\_min} + \frac{p_{m\_max} - p_{m\_min}}{NI} \times (2k - NI) & \text{if } k \ge NI/2 \end{cases}$
6. Threshold_4 | Threshold linear posterior decreasing strategy: $p_m^k = \begin{cases} p_{m\_max} & \text{if } k < NI/2 \\ p_{m\_max} + \frac{p_{m\_min} - p_{m\_max}}{NI} \times (2k - NI) & \text{if } k \ge NI/2 \end{cases}$
7. Exponential_1 | Natural exponential increasing strategy: $p_m^k = p_{m\_min} \times e^{\ln(p_{m\_max}/p_{m\_min}) \times k/NI}$
8. Exponential_2 | Natural exponential decreasing strategy: $p_m^k = p_{m\_max} \times e^{\ln(p_{m\_min}/p_{m\_max}) \times k/NI}$
9. Exponential_3 | Exponential increasing strategy with $mr = 0.01$: $p_m^k = p_{m\_min} + (p_{m\_max} - p_{m\_min}) \times mr^{(NI - k)/NI}$
10. Exponential_4 | Exponential decreasing strategy with $mr = 0.01$: $p_m^k = p_{m\_min} + (p_{m\_max} - p_{m\_min}) \times mr^{k/NI}$
11. Exponential_5 | Exponential increasing strategy with $mr = 0.001$: $p_m^k = p_{m\_min} + (p_{m\_max} - p_{m\_min}) \times mr^{(NI - k)/NI}$
12. Exponential_6 | Exponential decreasing strategy with $mr = 0.001$: $p_m^k = p_{m\_min} + (p_{m\_max} - p_{m\_min}) \times mr^{k/NI}$
13. Cosine_1 | Concave cosine strategy with $cc = 1$: $p_m^k = \frac{p_{m\_max} + p_{m\_min}}{2} + \frac{p_{m\_max} - p_{m\_min}}{2} \times \cos\left(\frac{k \times cc \times 2\pi}{NI}\right)$
14. Cosine_2 | Convex cosine strategy with $cc = 1$: $p_m^k = \frac{p_{m\_max} + p_{m\_min}}{2} - \frac{p_{m\_max} - p_{m\_min}}{2} \times \cos\left(\frac{k \times cc \times 2\pi}{NI}\right)$
15. Cosine_3 | Concave cosine strategy with $cc = 3$: $p_m^k = \frac{p_{m\_max} + p_{m\_min}}{2} + \frac{p_{m\_max} - p_{m\_min}}{2} \times \cos\left(\frac{k \times cc \times 2\pi}{NI}\right)$
16. Cosine_4 | Convex cosine strategy with $cc = 3$: $p_m^k = \frac{p_{m\_max} + p_{m\_min}}{2} - \frac{p_{m\_max} - p_{m\_min}}{2} \times \cos\left(\frac{k \times cc \times 2\pi}{NI}\right)$
Table 2. Input and output variables.

Variables | Name | Description
Input | 1. Gender ($A_1$) | (1) Male; (2) Female
Input | 2. Education status ($A_2$) | (1) Elementary school or illiterate; (2) Junior high school; (3) Senior high school; (4) College or university
Input | 3. Marriage status ($A_3$) | (1) Unmarried; (2) Married; (3) Divorced; (4) Widowed
Input | 4. Affection of spouse ($A_4$) | (1) Unmarried; (2) Good; (3) Occasional conflict; (4) Alienation
Input | 5. Personal or family financial situation ($A_5$) | (1) Poverty; (2) Barely living; (3) Well-to-do; (4) Rich
Input | 6. Job satisfaction ($A_6$) | (1) Good; (2) Normal; (3) Not good; (4) Others
Input | 7. Crime ($A_7$) | The name of the crime for which they were imprisoned: (1) Embezzlement and dereliction of duty; (2) Public danger; (3) Forging seals, documents, securities, or currency; (4) Offenses against sexual morality; (5) Homicide; (6) Injury; (7) Theft; (8) Impairment of freedom; (9) Violation of regulations on the control of guns, shells, and knives; (10) Violation of drug control regulations; (11) Others
Input | 8. Length of criminal experience ($A_8$) | The time between the first prosecution and the final prosecution: (1) 1 year or less; (2) 1 to 5 years; (3) 5 to 10 years; (4) over 10 years
Input | 9. History of committing a crime ($A_9$) | (1) First offense; (2) Recommitted after more than 5 years; (3) Recommitted within 5 years
Input | 10. The age of going to jail ($A_{10}$) | (1) 19 years old or younger; (2) 20–29 years old; (3) 30–39 years old; (4) 40–49 years old; (5) 50–59 years old; (6) 60 years old or older
Input | 11. Total amount of going to jail ($A_{11}$) | The cumulative number of imprisonments.
Input | 12. The time of serving a sentence ($A_{12}$) | The total days between the imprisonment execution and the release: (1) 180 days or fewer; (2) 181–365 days (1 year); (3) 366–1095 days (3 years); (4) 1096–1825 days (5 years); (5) 1826 days or more
Output (original) | Recidivism ($T$) | (1) No; (2) Yes
Output (statistical) | Recidivism rate ($T$) | (1) Low; (2) High
Table 3. Trial and error experimental results of the backpropagation network (BPN).

Dataset | Training | Training | Training | Validation | Validation | Validation | Testing | Testing | Testing
LR | 0.1 | 0.2 | 0.3 | 0.1 | 0.2 | 0.3 | 0.1 | 0.2 | 0.3
MT | 0.9 | 0.8 | 0.7 | 0.9 | 0.8 | 0.7 | 0.9 | 0.8 | 0.7
K = 1 | 3.37 × 10−1 | 3.42 × 10−1 | 3.48 × 10−1 | 3.59 × 10−1 | 3.60 × 10−1 | 3.63 × 10−1 | 3.61 × 10−1 | 3.65 × 10−1 | 3.70 × 10−1
K = 2 | 3.34 × 10−1 | 3.36 × 10−1 | 3.39 × 10−1 | 3.58 × 10−1 | 3.62 × 10−1 | 3.63 × 10−1 | 3.57 × 10−1 | 3.63 × 10−1 | 3.63 × 10−1
K = 3 | 3.31 × 10−1 | 3.38 × 10−1 | 3.46 × 10−1 | 3.57 × 10−1 | 3.59 × 10−1 | 3.64 × 10−1 | 3.61 × 10−1 | 3.62 × 10−1 | 3.69 × 10−1
K = 4 | 3.39 × 10−1 | 3.46 × 10−1 | 3.47 × 10−1 | 3.58 × 10−1 | 3.60 × 10−1 | 3.64 × 10−1 | 3.58 × 10−1 | 3.65 × 10−1 | 3.67 × 10−1
K = 5 | 3.40 × 10−1 | 3.44 × 10−1 | 3.53 × 10−1 | 3.61 × 10−1 | 3.64 × 10−1 | 3.69 × 10−1 | 3.63 × 10−1 | 3.65 × 10−1 | 3.72 × 10−1
K = 6 | 3.20 × 10−1 | 3.27 × 10−1 | 3.28 × 10−1 | 3.53 × 10−1 | 3.56 × 10−1 | 3.57 × 10−1 | 3.50 × 10−1 | 3.49 × 10−1 | 3.50 × 10−1
K = 7 | 3.12 × 10−1 | 3.17 × 10−1 | 3.19 × 10−1 | 3.41 × 10−1 | 3.49 × 10−1 | 3.53 × 10−1 | 3.47 × 10−1 | 3.47 × 10−1 | 3.49 × 10−1
K = 8 | 3.13 × 10−1 | 3.15 × 10−1 | 3.20 × 10−1 | 3.43 × 10−1 | 3.43 × 10−1 | 3.46 × 10−1 | 3.47 × 10−1 | 3.48 × 10−1 | 3.47 × 10−1
K = 9 | 3.10 × 10−1 | 3.15 × 10−1 | 3.14 × 10−1 | 3.47 × 10−1 | 3.55 × 10−1 | 3.56 × 10−1 | 3.48 × 10−1 | 3.50 × 10−1 | 3.57 × 10−1
K = 10 | 3.13 × 10−1 | 3.16 × 10−1 | 3.16 × 10−1 | 3.51 × 10−1 | 3.62 × 10−1 | 3.62 × 10−1 | 3.24 × 10−1 | 3.32 × 10−1 | 3.36 × 10−1
Mean | 3.25 × 10−1 | 3.29 × 10−1 | 3.33 × 10−1 | 3.53 × 10−1 | 3.57 × 10−1 | 3.60 × 10−1 | 3.52 × 10−1 | 3.55 × 10−1 | 3.58 × 10−1
Std | 1.53 × 10−4 | 1.69 × 10−4 | 2.34 × 10−4 | 5.15 × 10−5 | 4.16 × 10−5 | 4.51 × 10−5 | 1.32 × 10−4 | 1.23 × 10−4 | 1.47 × 10−4
p-value | - | 6.48 × 10−6 | 6.37 × 10−5 | - | 2.76 × 10−3 | 3.29 × 10−5 | - | 4.73 × 10−3 | 4.95 × 10−4
Table 4. Parameters of compared harmony search (HS) algorithms.

Algorithm | m 1 | HMCR 2 | PAR 3 | BW 4 | LP 5 | $p_m$ 6 | NI 7 | n 8
HS | 5 | 0.9 | 0.3 | 0.01 | - | - | 10,000 | 30
IHS | 5 | 0.9 | $PAR_{min}$ = 0.01, $PAR_{max}$ = 0.99 | $BW_{max} = (x_{jU} - x_{jL})/20$, $BW_{min}$ = 0.0001 | - | - | 10,000 | 30
SGHS | 5 | $HMCR_m$ = 0.98 | $PAR_m$ = 0.9 | $BW_{max} = (x_{jU} - x_{jL})/10$, $BW_{min}$ = 0.0001 | 100 | - | 10,000 | 30
NGHS | 5 | - | - | - | - | 0.05 | 10,000 | 30
DANGHS | 5 | - | - | - | - | $p_{m\_min}$ = 0.001, $p_{m\_max}$ = 0.100 | 10,000 | 30
1 m: the harmony memory size; 2 HMCR: the harmony memory considering rate; 3 PAR: the pitch adjusting rate; 4 BW: the bandwidth; 5 LP: the learning period; 6 $p_m$: the genetic mutation probability; 7 NI: the maximum number of iterations; 8 n: the total number of independent experiments.
Table 5. Experimental results of the learning dataset for 16 strategies in the DANGHS-ANN systems.

Adjustment Strategy | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | K = 9 | K = 10 | Mean | Std | p-Value
Straight_1 | 3.2224 × 10−1 | 3.2045 × 10−1 | 3.1967 × 10−1 | 3.1983 × 10−1 | 3.2095 × 10−1 | 3.2206 × 10−1 | 3.2002 × 10−1 | 3.1880 × 10−1 | 3.1785 × 10−1 | 3.1812 × 10−1 | 3.2000 × 10−1 | 2.2328 × 10−6 | 0.0000
Straight_2 | 3.1329 × 10−1 | 3.1226 × 10−1 | 3.1410 × 10−1 | 3.1216 × 10−1 | 3.1125 × 10−1 | 3.1393 × 10−1 | 3.1247 × 10−1 | 3.1013 × 10−1 | 3.0923 × 10−1 | 3.0959 × 10−1 | 3.1184 × 10−1 | 3.0458 × 10−6 | 0.0803
Threshold_1 | 3.2302 × 10−1 | 3.2086 × 10−1 | 3.2260 × 10−1 | 3.2105 × 10−1 | 3.2164 × 10−1 | 3.2066 × 10−1 | 3.2231 × 10−1 | 3.1788 × 10−1 | 3.2048 × 10−1 | 3.1847 × 10−1 | 3.2090 × 10−1 | 2.7978 × 10−6 | 0.0000
Threshold_2 | 3.1543 × 10−1 | 3.1442 × 10−1 | 3.1369 × 10−1 | 3.1388 × 10−1 | 3.1590 × 10−1 | 3.1322 × 10−1 | 3.1462 × 10−1 | 3.1297 × 10−1 | 3.1091 × 10−1 | 3.1115 × 10−1 | 3.1362 × 10−1 | 2.6869 × 10−6 | 0.0000
Threshold_3 | 3.2455 × 10−1 | 3.2504 × 10−1 | 3.2758 × 10−1 | 3.2493 × 10−1 | 3.2344 × 10−1 | 3.2536 × 10−1 | 3.2393 × 10−1 | 3.2399 × 10−1 | 3.2304 × 10−1 | 3.2106 × 10−1 | 3.2429 × 10−1 | 2.8818 × 10−6 | 0.0000
Threshold_4 | 3.1190 × 10−1 | 3.1194 × 10−1 | 3.1240 × 10−1 | 3.1248 × 10−1 | 3.1201 × 10−1 | 3.1131 × 10−1 | 3.1193 × 10−1 | 3.0977 × 10−1 | 3.0985 × 10−1 | 3.0960 × 10−1 | 3.1132 × 10−1 | 1.2896 × 10−6 | -
Exponential_1 | 3.2783 × 10−1 | 3.2590 × 10−1 | 3.2545 × 10−1 | 3.2444 × 10−1 | 3.2537 × 10−1 | 3.2610 × 10−1 | 3.2571 × 10−1 | 3.2314 × 10−1 | 3.2227 × 10−1 | 3.2314 × 10−1 | 3.2493 × 10−1 | 2.8363 × 10−6 | 0.0000
Exponential_2 | 3.1545 × 10−1 | 3.1680 × 10−1 | 3.1691 × 10−1 | 3.1674 × 10−1 | 3.1451 × 10−1 | 3.1726 × 10−1 | 3.1569 × 10−1 | 3.1341 × 10−1 | 3.1087 × 10−1 | 3.1194 × 10−1 | 3.1496 × 10−1 | 4.9831 × 10−6 | 0.0000
Exponential_3 | 3.2508 × 10−1 | 3.2614 × 10−1 | 3.2427 × 10−1 | 3.2428 × 10−1 | 3.2498 × 10−1 | 3.2655 × 10−1 | 3.2378 × 10−1 | 3.2595 × 10−1 | 3.2178 × 10−1 | 3.2161 × 10−1 | 3.2444 × 10−1 | 2.8763 × 10−6 | 0.0000
Exponential_4 | 3.1474 × 10−1 | 3.1436 × 10−1 | 3.1577 × 10−1 | 3.1473 × 10−1 | 3.1433 × 10−1 | 3.1472 × 10−1 | 3.1518 × 10−1 | 3.1390 × 10−1 | 3.1197 × 10−1 | 3.1098 × 10−1 | 3.1407 × 10−1 | 2.1770 × 10−6 | 0.0000
Exponential_5 | 3.2805 × 10−1 | 3.2650 × 10−1 | 3.2835 × 10−1 | 3.2599 × 10−1 | 3.2738 × 10−1 | 3.2415 × 10−1 | 3.2708 × 10−1 | 3.2527 × 10−1 | 3.2416 × 10−1 | 3.2602 × 10−1 | 3.2629 × 10−1 | 2.1714 × 10−6 | 0.0000
Exponential_6 | 3.1707 × 10−1 | 3.1588 × 10−1 | 3.1800 × 10−1 | 3.1653 × 10−1 | 3.1622 × 10−1 | 3.1787 × 10−1 | 3.1793 × 10−1 | 3.1428 × 10−1 | 3.1315 × 10−1 | 3.1343 × 10−1 | 3.1604 × 10−1 | 3.3797 × 10−6 | 0.0000
Cosine_1 | 3.1494 × 10−1 | 3.1517 × 10−1 | 3.1399 × 10−1 | 3.1446 × 10−1 | 3.1385 × 10−1 | 3.1344 × 10−1 | 3.1407 × 10−1 | 3.1205 × 10−1 | 3.1102 × 10−1 | 3.0983 × 10−1 | 3.1328 × 10−1 | 3.0754 × 10−6 | 0.0000
Cosine_2 | 3.1846 × 10−1 | 3.1709 × 10−1 | 3.1760 × 10−1 | 3.1545 × 10−1 | 3.1765 × 10−1 | 3.1624 × 10−1 | 3.1913 × 10−1 | 3.1698 × 10−1 | 3.1509 × 10−1 | 3.1466 × 10−1 | 3.1683 × 10−1 | 2.1483 × 10−6 | 0.0000
Cosine_3 | 3.1599 × 10−1 | 3.1567 × 10−1 | 3.1668 × 10−1 | 3.1618 × 10−1 | 3.1529 × 10−1 | 3.1675 × 10−1 | 3.1805 × 10−1 | 3.1431 × 10−1 | 3.1281 × 10−1 | 3.1388 × 10−1 | 3.1556 × 10−1 | 2.3853 × 10−6 | 0.0000
Cosine_4 | 3.1636 × 10−1 | 3.1981 × 10−1 | 3.1802 × 10−1 | 3.1638 × 10−1 | 3.1922 × 10−1 | 3.1721 × 10−1 | 3.1902 × 10−1 | 3.1489 × 10−1 | 3.1464 × 10−1 | 3.1321 × 10−1 | 3.1688 × 10−1 | 4.7776 × 10−6 | 0.0000
Table 6. Experimental results on the testing dataset for the 16 strategies in the DANGHS-ANN system.
Adjustment Strategy | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | K = 9 | K = 10 | Mean | Std | p-Value
Straight_1 | 3.5066 × 10−1 | 3.5110 × 10−1 | 3.5071 × 10−1 | 3.4900 × 10−1 | 3.5398 × 10−1 | 3.5081 × 10−1 | 3.5212 × 10−1 | 3.5622 × 10−1 | 3.5442 × 10−1 | 3.3011 × 10−1 | 3.4991 × 10−1 | 5.3088 × 10−5 | 0.0000
Straight_2 | 3.4168 × 10−1 | 3.4021 × 10−1 | 3.4280 × 10−1 | 3.3963 × 10−1 | 3.4012 × 10−1 | 3.4187 × 10−1 | 3.4139 × 10−1 | 3.4207 × 10−1 | 3.4041 × 10−1 | 3.2265 × 10−1 | 3.3928 × 10−1 | 3.5199 × 10−5 | 0.2606
Threshold_1 | 3.5188 × 10−1 | 3.4963 × 10−1 | 3.5129 × 10−1 | 3.4968 × 10−1 | 3.5242 × 10−1 | 3.4788 × 10−1 | 3.5266 × 10−1 | 3.5076 × 10−1 | 3.5149 × 10−1 | 3.3353 × 10−1 | 3.4912 × 10−1 | 3.2137 × 10−5 | 0.0000
Threshold_2 | 3.4563 × 10−1 | 3.4251 × 10−1 | 3.4207 × 10−1 | 3.4412 × 10−1 | 3.4490 × 10−1 | 3.4095 × 10−1 | 3.4456 × 10−1 | 3.4797 × 10−1 | 3.4602 × 10−1 | 3.2445 × 10−1 | 3.4232 × 10−1 | 4.3649 × 10−5 | 0.0000
Threshold_3 | 3.5368 × 10−1 | 3.5051 × 10−1 | 3.5876 × 10−1 | 3.5549 × 10−1 | 3.5183 × 10−1 | 3.5188 × 10−1 | 3.5432 × 10−1 | 3.5539 × 10−1 | 3.5471 × 10−1 | 3.2918 × 10−1 | 3.5158 × 10−1 | 6.7327 × 10−5 | 0.0000
Threshold_4 | 3.3934 × 10−1 | 3.4261 × 10−1 | 3.4036 × 10−1 | 3.4222 × 10−1 | 3.4295 × 10−1 | 3.3953 × 10−1 | 3.4139 × 10−1 | 3.4031 × 10−1 | 3.4017 × 10−1 | 3.1889 × 10−1 | 3.3878 × 10−1 | 5.0441 × 10−5 | -
Exponential_1 | 3.6271 × 10−1 | 3.5754 × 10−1 | 3.5549 × 10−1 | 3.5520 × 10−1 | 3.5925 × 10−1 | 3.5408 × 10−1 | 3.5735 × 10−1 | 3.5573 × 10−1 | 3.5422 × 10−1 | 3.3709 × 10−1 | 3.5487 × 10−1 | 4.5814 × 10−5 | 0.0000
Exponential_2 | 3.4563 × 10−1 | 3.4461 × 10−1 | 3.4592 × 10−1 | 3.4470 × 10−1 | 3.4231 × 10−1 | 3.4763 × 10−1 | 3.4841 × 10−1 | 3.4739 × 10−1 | 3.4563 × 10−1 | 3.2099 × 10−1 | 3.4332 × 10−1 | 6.4665 × 10−5 | 0.0000
Exponential_3 | 3.5583 × 10−1 | 3.5798 × 10−1 | 3.5647 × 10−1 | 3.5295 × 10−1 | 3.5739 × 10−1 | 3.5881 × 10−1 | 3.5686 × 10−1 | 3.6232 × 10−1 | 3.5364 × 10−1 | 3.3621 × 10−1 | 3.5485 × 10−1 | 4.9757 × 10−5 | 0.0000
Exponential_4 | 3.4256 × 10−1 | 3.4505 × 10−1 | 3.4705 × 10−1 | 3.4592 × 10−1 | 3.4397 × 10−1 | 3.4290 × 10−1 | 3.4241 × 10−1 | 3.4959 × 10−1 | 3.4275 × 10−1 | 3.2113 × 10−1 | 3.4233 × 10−1 | 6.0866 × 10−5 | 0.0000
Exponential_5 | 3.5939 × 10−1 | 3.5461 × 10−1 | 3.5691 × 10−1 | 3.5598 × 10−1 | 3.5588 × 10−1 | 3.5500 × 10−1 | 3.5608 × 10−1 | 3.6105 × 10−1 | 3.5588 × 10−1 | 3.3821 × 10−1 | 3.5490 × 10−1 | 3.8344 × 10−5 | 0.0000
Exponential_6 | 3.4778 × 10−1 | 3.4470 × 10−1 | 3.4778 × 10−1 | 3.4758 × 10−1 | 3.4456 × 10−1 | 3.4534 × 10−1 | 3.4905 × 10−1 | 3.4568 × 10−1 | 3.4871 × 10−1 | 3.2650 × 10−1 | 3.4477 × 10−1 | 4.3867 × 10−5 | 0.0000
Cosine_1 | 3.4412 × 10−1 | 3.4422 × 10−1 | 3.4168 × 10−1 | 3.4451 × 10−1 | 3.4505 × 10−1 | 3.4114 × 10−1 | 3.4363 × 10−1 | 3.4427 × 10−1 | 3.4441 × 10−1 | 3.2020 × 10−1 | 3.4132 × 10−1 | 5.6652 × 10−5 | 0.0000
Cosine_2 | 3.4744 × 10−1 | 3.4910 × 10−1 | 3.4578 × 10−1 | 3.4231 × 10−1 | 3.4549 × 10−1 | 3.4295 × 10−1 | 3.4822 × 10−1 | 3.5232 × 10−1 | 3.4622 × 10−1 | 3.2552 × 10−1 | 3.4453 × 10−1 | 5.3057 × 10−5 | 0.0000
Cosine_3 | 3.4490 × 10−1 | 3.4578 × 10−1 | 3.4685 × 10−1 | 3.4900 × 10−1 | 3.4563 × 10−1 | 3.4661 × 10−1 | 3.5110 × 10−1 | 3.4505 × 10−1 | 3.4632 × 10−1 | 3.2733 × 10−1 | 3.4486 × 10−1 | 4.1550 × 10−5 | 0.0000
Cosine_4 | 3.4568 × 10−1 | 3.5100 × 10−1 | 3.4939 × 10−1 | 3.4671 × 10−1 | 3.4680 × 10−1 | 3.4754 × 10−1 | 3.4993 × 10−1 | 3.4822 × 10−1 | 3.4685 × 10−1 | 3.2299 × 10−1 | 3.4551 × 10−1 | 6.5375 × 10−5 | 0.0000
Table 7. Experimental results on the learning dataset for the six forecasting systems.
Forecasting System | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | K = 9 | K = 10 | Mean | Std | p-Value
BPN | 3.4310 × 10−1 | 3.4085 × 10−1 | 3.3810 × 10−1 | 3.4446 × 10−1 | 3.4561 × 10−1 | 3.2902 × 10−1 | 3.2040 × 10−1 | 3.2120 × 10−1 | 3.2023 × 10−1 | 3.2342 × 10−1 | 3.3264 × 10−1 | 1.0787 × 10−2 | 0.0000
HS-ANN | 3.1379 × 10−1 | 3.1470 × 10−1 | 3.1366 × 10−1 | 3.1551 × 10−1 | 3.1467 × 10−1 | 3.1442 × 10−1 | 3.1317 × 10−1 | 3.1245 × 10−1 | 3.1218 × 10−1 | 3.1114 × 10−1 | 3.1357 × 10−1 | 1.3453 × 10−3 | 0.0000
IHS-ANN | 3.1465 × 10−1 | 3.1377 × 10−1 | 3.1412 × 10−1 | 3.1447 × 10−1 | 3.1356 × 10−1 | 3.1419 × 10−1 | 3.1332 × 10−1 | 3.1191 × 10−1 | 3.1049 × 10−1 | 3.1097 × 10−1 | 3.1315 × 10−1 | 1.4893 × 10−3 | 0.0000
SGHS-ANN | 3.4227 × 10−1 | 3.4464 × 10−1 | 3.4309 × 10−1 | 3.4375 × 10−1 | 3.4198 × 10−1 | 3.4377 × 10−1 | 3.4365 × 10−1 | 3.4092 × 10−1 | 3.4061 × 10−1 | 3.4048 × 10−1 | 3.4252 × 10−1 | 1.4864 × 10−3 | 0.0000
NGHS-ANN | 3.1622 × 10−1 | 3.1682 × 10−1 | 3.1586 × 10−1 | 3.1558 × 10−1 | 3.1583 × 10−1 | 3.1675 × 10−1 | 3.1473 × 10−1 | 3.1570 × 10−1 | 3.1313 × 10−1 | 3.1390 × 10−1 | 3.1545 × 10−1 | 1.1967 × 10−3 | 0.0000
DANGHS-ANN | 3.1190 × 10−1 | 3.1194 × 10−1 | 3.1240 × 10−1 | 3.1248 × 10−1 | 3.1201 × 10−1 | 3.1131 × 10−1 | 3.1193 × 10−1 | 3.0977 × 10−1 | 3.0985 × 10−1 | 3.0960 × 10−1 | 3.1132 × 10−1 | 1.1356 × 10−3 | -
Table 8. Experimental results on the testing dataset for the six forecasting systems.
Forecasting System | K = 1 | K = 2 | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | K = 9 | K = 10 | Mean | Std | p-Value
BPN | 3.6140 × 10−1 | 3.5652 × 10−1 | 3.6091 × 10−1 | 3.5837 × 10−1 | 3.6281 × 10−1 | 3.4978 × 10−1 | 3.4675 × 10−1 | 3.4705 × 10−1 | 3.4768 × 10−1 | 3.2435 × 10−1 | 3.5156 × 10−1 | 1.1467 × 10−2 | 0.0000
HS-ANN | 3.3973 × 10−1 | 3.4290 × 10−1 | 3.4153 × 10−1 | 3.4368 × 10−1 | 3.4383 × 10−1 | 3.4519 × 10−1 | 3.4358 × 10−1 | 3.4495 × 10−1 | 3.4524 × 10−1 | 3.2201 × 10−1 | 3.4126 × 10−1 | 6.9794 × 10−3 | 0.0018
IHS-ANN | 3.4344 × 10−1 | 3.3982 × 10−1 | 3.4510 × 10−1 | 3.4329 × 10−1 | 3.4324 × 10−1 | 3.4251 × 10−1 | 3.4056 × 10−1 | 3.4197 × 10−1 | 3.4241 × 10−1 | 3.2157 × 10−1 | 3.4039 × 10−1 | 6.7783 × 10−3 | 0.0258
SGHS-ANN | 3.7355 × 10−1 | 3.7618 × 10−1 | 3.7696 × 10−1 | 3.7491 × 10−1 | 3.7609 × 10−1 | 3.7531 × 10−1 | 3.7443 × 10−1 | 3.7413 × 10−1 | 3.7082 × 10−1 | 3.5725 × 10−1 | 3.7296 × 10−1 | 5.7819 × 10−3 | 0.0000
NGHS-ANN | 3.4563 × 10−1 | 3.4558 × 10−1 | 3.4578 × 10−1 | 3.4514 × 10−1 | 3.4788 × 10−1 | 3.4680 × 10−1 | 3.4832 × 10−1 | 3.5046 × 10−1 | 3.4300 × 10−1 | 3.2352 × 10−1 | 3.4421 × 10−1 | 7.5482 × 10−3 | 0.0000
DANGHS-ANN | 3.3934 × 10−1 | 3.4261 × 10−1 | 3.4036 × 10−1 | 3.4222 × 10−1 | 3.4295 × 10−1 | 3.3953 × 10−1 | 3.4139 × 10−1 | 3.4031 × 10−1 | 3.4017 × 10−1 | 3.1889 × 10−1 | 3.3878 × 10−1 | 7.1022 × 10−3 | -
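In Tables 7 and 8, the best-performing system (DANGHS-ANN, marked "-" in the p-Value column) serves as the reference against which the other systems' K = 1–10 results are compared. The sketch below shows one way such a column could be computed with a paired t-test over the ten runs; the exact statistical test used in the study is not restated here, so the choice of test is an assumption.

```python
from scipy.stats import ttest_rel

# K = 1..10 testing errors copied from Table 8 (each value is x 10^-1 in the table).
bpn        = [0.36140, 0.35652, 0.36091, 0.35837, 0.36281, 0.34978, 0.34675, 0.34705, 0.34768, 0.32435]
danghs_ann = [0.33934, 0.34261, 0.34036, 0.34222, 0.34295, 0.33953, 0.34139, 0.34031, 0.34017, 0.31889]

# One way to obtain a p-value column like Table 8's: pair each system's ten
# results with the reference row (DANGHS-ANN). The use of a paired t-test here
# is an assumption for illustration, not the study's documented procedure.
result = ttest_rel(bpn, danghs_ann)
print(f"paired t-test p-value (BPN vs. DANGHS-ANN): {result.pvalue:.4g}")
```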
Table 9. Comparison between the actual and predicted recidivism in the BPN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1766 / 66.59% | 886 / 33.41% | 2652 / 100.00%
High | 1335 / 31.99% | 2838 / 68.01% | 4173 / 100.00%
Table 10. Comparison between the actual and predicted recidivism in the HS-ANN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1755 / 67.60% | 841 / 32.40% | 2596 / 100.00%
High | 1346 / 31.83% | 2883 / 68.17% | 4229 / 100.00%
Table 11. Comparison between the actual and predicted recidivism in the IHS-ANN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1630 / 68.00% | 767 / 32.00% | 2397 / 100.00%
High | 1471 / 33.22% | 2957 / 66.78% | 4428 / 100.00%
Table 12. Comparison between the actual and predicted recidivism in the SGHS-ANN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1876 / 66.83% | 931 / 33.17% | 2807 / 100.00%
High | 1225 / 30.49% | 2793 / 69.51% | 4018 / 100.00%
Table 13. Comparison between the actual and predicted recidivism in the NGHS-ANN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1609 / 67.01% | 792 / 32.99% | 2401 / 100.00%
High | 1492 / 33.73% | 2932 / 66.27% | 4424 / 100.00%
Table 14. Comparison between the actual and predicted recidivism in the DANGHS-ANN system.
Predicted \ Actual | Low Recidivism Rate (Amount / Ratio) | High Recidivism Rate (Amount / Ratio) | Total (Amount / Ratio)
Low | 1710 / 68.18% | 798 / 31.82% | 2508 / 100.00%
High | 1391 / 32.22% | 2926 / 67.78% | 4317 / 100.00%
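Tables 9–14 all share the same layout: each row conditions on the predicted class and reports how many of those cases were actually low- or high-recidivism, together with the row percentages. A minimal sketch of how such a row-normalised confusion matrix can be computed from label lists is shown below; the variable names and sample data are illustrative only, not taken from the study.

```python
from collections import Counter

def row_normalised_confusion(actual, predicted, labels=("Low", "High")):
    """Counts and row percentages conditioned on the predicted class,
    matching the layout of Tables 9-14."""
    counts = Counter(zip(predicted, actual))
    table = {}
    for pred in labels:
        row = [counts[(pred, act)] for act in labels]
        total = sum(row)
        ratios = [100.0 * n / total if total else 0.0 for n in row]
        table[pred] = (row, ratios, total)
    return table

# Tiny illustrative example (not the study's data):
actual    = ["Low", "High", "Low", "High", "High", "Low"]
predicted = ["Low", "High", "Low", "Low",  "High", "High"]
for pred, (row, ratios, total) in row_normalised_confusion(actual, predicted).items():
    print(pred, row, [f"{r:.2f}%" for r in ratios], total)
```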
