1. Introduction
Forests are not only important for sustainability but in fact indispensable resources for human life on Earth. Hence, forest management has been the subject of much research.
  1.1. Context to Forest Management
Forests are complex systems that provide a wide range of ecosystem services, contributing to environmental sustainability and resilience. However, the quantity and diversity of these services depend heavily on the type of forest management applied and the initial conditions of the landscape.
In Portugal, particularly in the central region, the forest landscapes are mainly composed of planted forests owned by many small non-industrial private landowners, whose primary objective is wood production [1, 2].
The extensive use of maritime pine and eucalyptus, along with agricultural abandonment and lack of active forest management, has led to large areas of forest monocultures and scrublands. These conditions, combined with orographic and climatic factors typical of the region, as well as climate change, have resulted in widespread forest fires and the spread of invasive woody species. This has caused the loss of ecological and economic value of forest areas and their degradation, significantly altering the landscape and compromising sustainability.
In response to this concern, Portugal has adopted the recommendation of several international bodies and EU policymakers to use a landscape approach to forest management [3, 4, 5]. This approach aims to better tailor policy measures and forest management to meet the expectations of local stakeholders and communities and to promote the resilience of forested landscapes [5, 6].
A new policy instrument known as AIGP (Integrated Landscape Management Areas) has emerged to promote forest management at the landscape level, proposing a multifunctional model for forest spatial planning. This model is based on functional zoning, which divides the territory into three categories: (1) the resilience structure, which aims to reduce fire hazards; (2) the ecological structure, which ensures the environmental services of the landscape and the conservation of natural areas and biodiversity; and (3) the remaining area, which has no designation but is called the matrix in this work, where wood production and other forest products with economic value can be pursued to meet the needs of forest landowners.
The first two structures are designed to adhere to the spatial environmental restrictions specified in the hierarchical planning system. In contrast, in the matrix, the use of a wide range of forest management models and silvicultural options is encouraged to promote landscape heterogeneity and structural diversity.
It should be noted that this model shares many similarities with those used in other geographical contexts, such as the TRIAD approach, which aims to minimise negative environmental impacts while maintaining the capacity to produce wood and generate economic income [7]. The TRIAD approach proposes creating three zones: one dedicated to nature conservation, another to intensive forestry, and a third where more extensive and diverse forestry practices are applied. Like the AIGP model, the TRIAD approach adopts a landscape-orientated perspective and explicit spatial planning, encouraging the adoption of multiple forest management types across the landscape in a portfolio of management strategies [8] to cope with unexpected results brought about by climate change [9, 10].
Diversifying management strategies can improve the ecological and economic resilience of the landscape, helping to reduce the likelihood of catastrophic events and the economic dependence on a narrow range of forest products [10].
Management strategies should integrate silvicultural practices to enhance forest resilience to climate change. These practices, among other adaptation measures, include species diversification, the establishment of mixed stands, the promotion of natural regeneration, adjustments to harvesting ages, and the implementation of shelterwood and selective cuts [11, 12, 13, 14, 15]. The diversification of silvicultural practices makes the process of selecting management alternatives significantly more complex, requiring careful evaluation of ecological, economic, and operational trade-offs.
Furthermore, historical, political, and social factors significantly influence the effectiveness of forest adaptation strategies, often limiting the universal applicability of findings from global studies in addressing global change [16]. Therefore, it is essential to develop locally tailored adaptation measures that account for specific regional contexts, stakeholder needs, and the long-term sustainability of forest ecosystems.
This adds another layer of complexity to the already intricate challenge of designing forest management compositions and drafting management plans [17, 18]. Many authors emphasise the importance of the learning process involved, highlighting its fundamental contribution to the quality and feasibility of the solutions developed [19, 20].
The specific social context in Portugal amplifies the importance of the learning process in forest management. Exploring new approaches to adaptive silviculture for forest stands and devising comprehensive management plans for areas with multiple forest owners are indispensable. The inherent complexity of the situation makes it difficult for forest owners and decision-makers to clearly define their preferences without understanding how different management alternatives align with their objectives and potential outcomes.
  1.2. Managing Forest Stands Using Genetic Algorithms
Traditional optimisation techniques, such as linear and dynamic programming, have long been applied to multi-objective problems such as harvest scheduling in forest management, with studies by Hoganson et al. [21] and Borges et al. [22] demonstrating their effectiveness in simpler scenarios. However, these methods often struggle to handle larger, more complex problems, where balancing multiple, often conflicting objectives, such as maximising timber yield while minimising environmental impact, becomes computationally prohibitive [23].
In order to address these limitations, heuristic approaches have been employed as a practical alternative, offering near-optimal solutions with manageable computational costs [23, 24]. Among these, Genetic Algorithms (GAs) have emerged as a powerful and widely used tool to solve complex, large-scale, multi-objective optimisation problems [25]. GAs utilise evolutionary principles to improve the populations of potential solutions over multiple generations [24]. In contrast to single-objective methods, which consolidate various objectives into a single metric [26], multi-objective GAs generate a set of solutions that balance the objectives in different ways. As a result, this approach is well suited to scenarios where the optimal solution may be represented by more than one alternative. In standard multi-objective optimisation, all objectives are distinct measurable criteria, each expressing some aspect requiring optimisation. Assuming all important criteria have been included as objectives, one may be unsure about their relative importance to a decision-maker (and/or about how to normalise them), but one can be certain that the ‘ideal’ solution will be Pareto-optimal, that is, no other solution is at least as good in every objective and strictly better in at least one.
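The notion of Pareto optimality used here can be made concrete with a short sketch. The Python below is illustrative only: the function names and the four example plans are hypothetical, and both objectives are expressed as values to be minimised (a maximised quantity such as harvest volume is negated).

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical plans: (negated harvest volume, standard deviation of harvest volume).
plans = [(-900.0, 40.0), (-700.0, 15.0), (-650.0, 30.0), (-900.0, 55.0)]
front = pareto_front(plans)
print(front)
```

In this example the third plan is dominated by the second, and the fourth by the first, so only the first two plans remain as trade-offs for a decision-maker to compare.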
A key development in this field is the Non-Dominated Sorting Genetic Algorithm II (NSGA-II), which was introduced by Deb et al. [27]. The effectiveness of NSGA-II in producing a range of Pareto-optimal solutions has been demonstrated, allowing decision-makers to examine a variety of trade-offs rather than a single, potentially biased solution. The core mechanisms of the algorithm, namely, non-dominated sorting and crowding distance, assist in the preservation of diversity within the solution set, thus ensuring a broad selection of high-quality outcomes [27].
This work is based on the underlying principles of NSGA-II and introduces a decision support tool for forest managers. This tool allows for the evaluation and optimisation of management strategies prior to the implementation of a specific plan. The tool has been designed with usability in mind and is accessible via a web-based platform, consisting of two integrated modules. The initial module simulates potential management options for each forest stand, allowing managers to evaluate prospective actions and their consequences for a variety of objectives. The second module employs a modified version of NSGA-II, referred to as custom NSGA-II, which incorporates a specialised mutation operator designed to enhance the generation of diverse solutions tailored to the specific challenges of forest management.
The custom NSGA-II algorithm is evaluated and compared with the standard NSGA-II by applying both methods to a practical example involving forest management. This evaluation is performed with a range of different parameter settings to assess their performance. The forest planning problem under consideration involves two primary goals that the algorithm aims to resolve: maximising timber harvest volume and minimising the standard deviation in harvest volumes across the planning horizon for all stands. A critical constraint ensures that all harvest volumes remain valid (i.e., greater than zero).
The analysis considers a range of parameter configurations, enabling a comprehensive evaluation of the efficiency of the algorithms employed and the diversity of the solutions generated. The outcome of this process is a method that enables forest managers to explore various combinations of stands, assess trade-offs, and analyse different harvest ages over time. It provides an efficient and user-friendly way to optimise forest management practices, helping decision-makers make informed decisions.
  3. Problem Definition
In this research, we address a problem that involves a forest composed of numerous and diverse stands, each containing up to three different species with varying current and harvest ages. Each stand may potentially support a variety of different compositions, allowing forest planners to introduce new species into the stand if desired.
The combinatorial problem analysed here aims to find the optimal combinations of stands and harvest ages to achieve two objectives simultaneously: maximising the total timber harvest and minimising the standard deviation of the total timber harvest for each period.
The optimisation goals focus on treating each stand with consideration for the collective objectives of the entire forest. The mathematical formulation of the problem may be described as

$$\max \; V = \sum_{n=1}^{N} \sum_{x=1}^{X} \sum_{t=1}^{T} v_{n,x,t} \qquad (1)$$

$$\min \; \sigma = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} \left( V_t - \bar{V} \right)^2} \qquad (2)$$

where
- $V$: volume harvested during the planning horizon for all stands N;
- $N$: number of stands;
- $X$: number of species;
- $T$: number of harvest periods;
- $v_{n,x,t}$: timber volume harvested during period t for species x of stand n;
- $\sigma$: standard deviation value during the planning horizon for all stands N;
- $V_t$: stand’s timber volume for period t;
- $\bar{V}$: mean of the sample for period t.
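The timber volume $v_{n,x,t}$ harvested during period t for species x of stand n can be obtained using the following formula:

$$v_{n,x,t} = a_n \, r_{n,x} \left( y_t + y^{\mathrm{th}}_t \right) \qquad (3)$$

where
- $a_n$: proportion of usable area in stand n;
- $r_{n,x}$: ratio of species x in stand n;
- $y_t$: volume of the stand for period t;
- $y^{\mathrm{th}}_t$: volume obtained from thinning.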
This bi-objective optimisation problem is complemented by a constraint where the total volume cannot be zero, as this represents an invalid solution: $V > 0$. Moreover, in the course of validating the solutions generated by the algorithm, any solution that exhibited harvest age values not within the specified ranges was considered to violate the constraints, according to the first objective function (1).
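As a minimal sketch of how one candidate plan could be scored, the Python below sums the harvested volume over stands, species, and periods, and takes the standard deviation of the per-period totals. The nested-list layout `volumes[n][x][t]`, the function name, and the use of the sample standard deviation are assumptions made for illustration, not details confirmed by the text.

```python
from statistics import stdev


def evaluate_plan(volumes):
    """Score a plan given volumes[n][x][t]: stand n, species x, period t (assumed layout).

    Returns (total volume, standard deviation of per-period totals, feasibility flag).
    """
    n_periods = len(volumes[0][0])
    per_period = [
        sum(species[t] for stand in volumes for species in stand)
        for t in range(n_periods)
    ]
    total = sum(per_period)
    spread = stdev(per_period)   # sample standard deviation (assumption)
    feasible = total > 0         # total harvested volume must not be zero
    return total, spread, feasible


# Hypothetical forest: stand 0 has two species, stand 1 has one; three periods.
volumes = [
    [[0.0, 120.0, 0.0], [0.0, 0.0, 80.0]],
    [[100.0, 0.0, 0.0]],
]
total, spread, feasible = evaluate_plan(volumes)
print(total, spread, feasible)
```

A minimising solver would receive the negated total volume as its first objective and the spread as its second.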
  3.1. Strategy for Multi-Objective Forest Management
The design of an evolutionary strategy to address the multi-objective forest management problem outlined in Equations (1)–(3) involved a comprehensive analysis of the solution fitness, the selection of genetic operators, and the fine-tuning of the parameters.
A forest consists of a set of stands, each occupying a polygonal area delineated on a map. In the case study, the forest corresponds to the landscape matrix, where the user defines on the map the set of stands that make up the forest (see Figure 1). Each stand has a total area and a usable area, which represents the portion of the total area that can be effectively used for tree coverage.
Each stand possesses specific characteristics that influence the structure of the algorithm.
A stand is considered homogeneous with respect to the attributes used to simulate its growth and yield. These attributes include the composition of the stand, that is, the species present (up to three species) and their respective proportions of occupancy, together with the yield class and age class for each species. These values can be obtained from forest inventories conducted for existing stands or estimated by the user for newly established stands. For newly established stands, the user can define up to three alternative compositions.
Figure 1 illustrates a hypothetical forest composed of two stands: stand 1 and stand 2. For stand 1, two alternative compositions are considered, representing two different combinations of species and their respective occupancy proportions. In the first composition, the user specified a mixed stand comprising three species (combination 1), while in the second composition, the user defined a mixed stand with two species (combination 2). In contrast, stand 2 is defined as a pure stand, consisting of only one species.
Each species can be represented by the index combination $(i, j, k)$, where $i$ denotes the stand (with $1 \le i \le m$, where m is the total number of stands), $j$ denotes the composition of the stand (with $1 \le j \le n$, where n is the maximum number of compositions of the stands), and $k$ denotes the species (with $1 \le k \le 3$). Here, the variables i, j, and k are indices used to identify specific stands, compositions, and species, respectively.
In addition to the basic parameters characterising the stands, forest management involves a sequence of interventions throughout the life cycle of the stands (referred to as the silvicultural model) to achieve the desired objectives. This model incorporates operations such as regeneration, thinning, the type of harvest cut (clearcut or shelterwood), and harvest age. These interventions are critical because they represent the management options with the greatest influence on both the volume production and the harvest schedule. The user of the Web platform can define these management interventions for each species.
Yield tables are crucial for forest management, as they provide valuable information on tree growth patterns and potential productivity in different forest stands. Forest managers and practitioners use these models to make decisions about the management of individual tree stands or entire estates, forecasting production levels, making commitments to the timber markets, and planning forest operations [60]. Yield classes and volume were estimated using yield tables, which were adjusted to provide volume and thinning estimates at consistent intervals. For each species, three quality classes were considered: high, medium, and low. The yield tables from Diéguez-Aranda et al. [61] were used for Pseudotsuga menziesii, Quercus robur, and Betula alba, while the tables from Santos et al. [62] and Patrício [63] were applied for Pinus pinaster and Castanea sativa.
Since the yield tables were originally designed for pure stands, their use in mixed stands is limited to stands mixed by patches or groups. Consequently, each species is assigned an independent silvicultural model, which includes defining the minimum and maximum harvest ages that must be considered when generating alternative solutions with the algorithm. It is acknowledged that both harvest age and harvest type can significantly influence the results.
In addition, the problem operates within a specified time frame. The planning horizon can be set to 50, 100, or 200 years, with harvest intervals of either 5 or 10 years.
Preliminary experiments produced sufficiently variable results to guide ongoing improvements to the second project module. At an early stage, modifications were made to enhance the primary structure of the problem, and subsequently adjustments were introduced to optimise the genetic operators. These refinements improved the performance of the algorithm and reduced its computational demands. Reeves et al. [41] highlighted that Genetic Algorithms are naturally adaptable, allowing easy modifications for changes in the original problem.
  3.2. Data Structure—Chromosome Representation
The Non-Dominated Sorting Genetic Algorithm II (NSGA-II), implemented through the PyMoo package (version 0.6.1.3) in Python [64], was used to find Pareto-optimal solutions, where improving one objective cannot occur without worsening another.
NSGA-II is a state-of-the-art Multi-objective Genetic Algorithm (MOGA) designed to overcome the limitations of earlier non-dominated sorting evolutionary algorithms, such as high computational complexity, non-elitism, and the need to specify a sharing parameter [27]. By integrating elitism, NSGA-II preserves the best solutions from the previous generation while employing the genetic operators of mutation and crossover to generate a new population. This approach not only accelerates convergence towards the optimal solution, but also enhances the overall efficiency of the search process.
This algorithm effectively manages optimisation constraints and maintains diversity in the population by using a crowded comparison operator. The solutions are initially ranked according to dominance, and then sorted according to crowding distance, which contributes to an efficient ranking system that reduces computational complexity [25, 65].
The approach carried out here highlights the efficient use of multi-objective optimisation algorithms to meet specific requirements and goals. Even in the face of strict limits, NSGA-II is able to generate various solutions for a forest management problem.
For the application of NSGA-II, the problem was encoded as follows. The Genetic Algorithm uses a chromosome of length L, where L is determined by the number of compositions, the number of species in each composition, and the defined harvest age periods for each species. Specifically, the length of the chromosome L is given by

$$L = C + T \sum_{j=1}^{C} I_j \qquad (4)$$

where C is the number of compositions, $I_j$ is the number of species in each composition j, and T is the number of harvest periods.
For example, applying Equation (4) to the test configuration, by first summing the species across all compositions and then computing the chromosome length, gives a chromosome length L of 63.
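The length calculation can be sketched in a few lines of Python. The layout assumed here (one selector gene per composition, followed by one gene per harvest period for every species) follows the description in this section, but the function name and the example values are illustrative assumptions: they describe the two-stand forest of Figure 1 with a 50-year horizon and 5-year intervals, which happens to reproduce a length of 63 and is not confirmed to be the configuration used by the authors.

```python
def chromosome_length(species_per_composition, n_periods):
    """One selector gene per composition plus n_periods harvest-age genes per species."""
    n_compositions = len(species_per_composition)
    return n_compositions + n_periods * sum(species_per_composition)


# Assumed example: compositions with 3, 2 and 1 species (as in Figure 1),
# and a 50-year horizon split into 5-year intervals.
n_periods = 50 // 5
length = chromosome_length([3, 2, 1], n_periods)
print(length)
```

Because the period genes dominate the total, halving the harvest interval roughly doubles the chromosome length, which is one reason the search space grows quickly for longer horizons.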
Each cell in this chromosome represents a gene, allowing the algorithm to evaluate and optimise based on the encoded information. Although problems typically solved with the PyMoo package are continuous in nature, it is also possible to use other types of variables [64]. In this case, a binary gene type is used. The first set of cells on the chromosome, which represent the stand’s alternatives, are subjected to a constraint that allows only one composition per stand to be selected. This constraint is applied across all stands.
Figure 2 illustrates the arrangement of genes within the problem chromosome, as well as the application of genetic operators such as crossover and mutation. Each gene in Figure 2 is encoded by the index combination $(i, j, k)$:
- $i$ represents the stand, where $1 \le i \le m$, with m being the total number of stands.
- $j$ represents the stand composition, where $1 \le j \le n$, with n being the maximum number of stand compositions.
- $k$ represents the species, where $1 \le k \le r$, with r being the maximum number of species.
In this context:
- i, j, and k are index variables used to identify the stands, compositions, and species, respectively.
- m, n, and r are constants denoting the maximum number of stands, compositions, and species in the system.
Subsequently, for each species present in the compositions, the corresponding gene is repeated T times, where T represents the number of periods.
For each group of periods, only one cell can represent the harvest age, ensuring that only one period is designated as the harvest age for that species according to the age range previously defined.
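The two structural rules described above (one composition per stand, one harvest period per species within its allowed age range) can be checked with a short routine. The sketch below is a hypothetical illustration: the gene ordering, the `stands` and `age_windows` arguments, and the function name are assumptions, and harvest-age ranges are expressed as period indices rather than ages.

```python
def is_valid(chromosome, stands, n_periods, age_windows):
    """Check a binary chromosome laid out as [composition genes | T genes per species].

    stands: one list per stand, holding the species count of each composition.
    age_windows: per species, the (first, last) period index allowed for harvest.
    """
    n_compositions = sum(len(stand) for stand in stands)

    # Rule 1: exactly one composition selected in each stand.
    pos = 0
    for stand in stands:
        if sum(chromosome[pos:pos + len(stand)]) != 1:
            return False
        pos += len(stand)

    # Rule 2: exactly one harvest period per species, inside its window.
    n_species = sum(count for stand in stands for count in stand)
    for s in range(n_species):
        start = n_compositions + s * n_periods
        block = chromosome[start:start + n_periods]
        if sum(block) != 1:
            return False
        first, last = age_windows[s]
        if not first <= block.index(1) <= last:
            return False
    return True


stands = [[3, 2], [1]]            # stand 1: two compositions; stand 2: one
n_periods = 4
windows = [(1, 3)] * 6            # six species blocks in total
good = [1, 0, 1] + [0, 0, 1, 0] * 6
bad_composition = [1, 1, 1] + [0, 0, 1, 0] * 6
bad_age = [1, 0, 1] + [1, 0, 0, 0] * 6
print(is_valid(good, stands, n_periods, windows))
```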
In the course of developing and implementing the NSGA-II algorithm to address this particular problem, it became evident that the scale of the problem posed significant challenges. The complexity of the problem was evidenced by the fact that each chromosome, for a simple forest scenario, consisted of more than 100 genes. This complexity proved detrimental to the algorithm, which was unable to achieve an acceptable ratio of valid to invalid solutions. This issue was particularly notable during the initial stages of the algorithm’s execution, with the need for more than 1000 iterations before the algorithm reached an optimal solution.
Therefore, it was established that the algorithm required modification in order to enhance the generation of valid solutions.
This custom NSGA-II algorithm differs from those commonly used in forest management problems, as it accommodates the presence of composition options for each stand [23], harvest-age limits, and a custom initial population and mutation operator. Furthermore, the algorithm is integrated into a novel Web-based platform, facilitating broad accessibility for any user.
The following sections provide a comprehensive analysis of the alterations made to the original NSGA-II algorithm, as outlined by the pseudocode in Algorithm 1.
        
Algorithm 1 Custom NSGA-II Procedure
Require: $N$, $g$, $f_k(x)$    ▹ $N$ members evolved over $g$ generations to solve $f_k(x)$
1: Initialize Population
2: Generate custom population of $N$ based on problem constraints
3: Calculate Objectives $f_1$ and $f_2$
4: Evaluate Constraint (total volume greater than zero)
5: Assign Rank based on Pareto dominance
6: Generate Child Population:
7:     Binary Tournament Selection
8:     Crossover and custom Mutation
9: for $i = 1$ to $g$ do
10:     for each Parent and Child in Population do
11:         Assign Rank based on Pareto dominance
12:         Generate sets of non-dominated solutions
13:         Determine Crowding distance
14:         Add solutions to next generation starting from the first front until $N$ individuals
15:     end for
16:     Select points on the lower front with high crowding distance
17:     Create next generation
18:         Binary Tournament Selection
19:         Crossover and custom Mutation
20: end for
21: Output Results
22:     Extract and plot best solutions
23:     Plot convergence and Pareto front
  3.3. Creation of an Initial Population
A population that is too small can result in insufficient coverage of the solution space, which may lead to inadequate exploration and an increased risk of premature convergence. On the other hand, an excessively large population can result in a significant increase in computational costs without a corresponding improvement in solution quality [41].
Preliminary tests showed that populations with fewer than 50 individuals exhibited a lack of diversity and frequently converged prematurely, resulting in an insufficient variety of solutions. In contrast, populations of over 500 chromosomes exhibited a considerable increase in computational costs without a corresponding improvement in diversity or the number of solutions, compared to a population size of approximately 200.
To further enhance the performance of the original NSGA-II, a customised operator was developed to generate the initial population (lines 1–2 of Algorithm 1), replacing the float random sampling provided by the PyMoo framework for NSGA-II [66]. This tailored approach assigns random binary values to each gene to form a chromosome, guaranteeing validity by ensuring that only one harvest age is selected for each species, rather than multiple. This effectively addresses a key issue in the original NSGA-II approach, where more invalid than valid solutions were generated.
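One way to build such an initial population is to sample each constrained group of genes as a whole instead of flipping bits independently. The following Python is a sketch under assumptions: the gene ordering, argument names, and the use of period indices for harvest-age ranges are hypothetical, and the actual operator in the platform may be implemented differently (for instance as a sampling class within the PyMoo framework).

```python
import random


def random_valid_chromosome(stands, n_periods, age_windows, rng):
    """Build one binary chromosome that satisfies the structural rules by construction."""
    genes = []
    # Pick exactly one composition per stand.
    for stand in stands:
        chosen = rng.randrange(len(stand))
        genes += [1 if c == chosen else 0 for c in range(len(stand))]
    # Pick exactly one harvest period per species, inside its allowed window.
    for first, last in age_windows:
        harvest = rng.randint(first, last)  # inclusive bounds
        genes += [1 if t == harvest else 0 for t in range(n_periods)]
    return genes


def initial_population(size, stands, n_periods, age_windows, seed=0):
    """Return `size` valid chromosomes; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [random_valid_chromosome(stands, n_periods, age_windows, rng)
            for _ in range(size)]


# Hypothetical forest: stand 1 with compositions of 3 and 2 species, stand 2 with 1.
stands = [[3, 2], [1]]
population = initial_population(200, stands, 10, [(2, 6)] * 6)
print(len(population), len(population[0]))
```

Sampling by group means no evaluations are spent on chromosomes that would be rejected immediately, at the cost of encoding the problem structure in the sampler.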
  3.4. Fitness Evaluation
Once the population has been generated, either by random initialisation or as a result of a previous generation (lines 1–2, 6, 17 of Algorithm 1), it is necessary to evaluate each individual. During the fitness evaluation phase, the algorithm calculates the values of two competing objectives for each individual: one to be maximised (Equation (1)) and the other to be minimised (Equation (2)).
In addition to this analysis, the individuals are classified as feasible or infeasible, based on their conformance to the constraints specific to the problem, such as the ranges of valid harvest ages. Subsequently, feasible solutions are prioritised during the sorting process.
The NSGA-II algorithm then proceeds to sort the combined parent and offspring populations according to the principle of non-dominance, resulting in the formation of multiple Pareto fronts (lines 5, 11–12 in Algorithm 1).
Non-dominated solutions form the first front, while subsequent fronts contain increasingly dominated solutions. For example, the second front contains the solutions dominated only by members of the first front, the third front those dominated only by members of the first two fronts, and so on. This classification process continues until all individuals have been assigned to a front.
Within each Pareto front, the solutions are further sorted using a crowding distance metric to maintain diversity (line 13 of Algorithm 1). The crowding distance is a measure of how far a solution is from its neighbours in the objective space. Solutions situated at the extremes of a front receive the largest crowding distances because they have a neighbour on only one side. By favouring solutions with higher crowding distances, the algorithm ensures that the solution space is explored in a well-distributed manner [67].
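The ranking and crowding steps can be illustrated with a compact sketch. The Python below uses a simple repeated-filtering sort rather than the faster bookkeeping of the published NSGA-II procedure, assigns an infinite distance to the extreme solutions of a front (the usual convention), and treats all objectives as minimised; the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Return fronts as lists of indices; the first front is non-dominated."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Sum, over objectives, of the normalised gap between each point's neighbours."""
    distance = {i: 0.0 for i in front}
    for k in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][k])
        low, high = points[ordered[0]][k], points[ordered[-1]][k]
        # Extreme solutions are always kept.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (points[nxt][k] - points[prev][k]) / (high - low)
    return distance


points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(points)
distances = crowding_distance(points, fronts[0])
print(fronts, distances)
```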
  3.5. Selection
NSGA-II employs an elitist strategy to ensure the preservation of high-quality solutions across generations: the current population is combined with its offspring before the survivors are chosen. Parents are selected with the Binary Tournament Selection operator, as outlined in Deb et al. [68]. This operator compares two individuals at a time, first by their Pareto rank and then by their crowding distance if they belong to the same front. The individual with the lower rank or, when the ranks are equal, the larger crowding distance is selected. This process ensures that solutions with superior trade-offs and greater diversity are more likely to be selected, helping to balance exploration and exploitation (lines 16–18 of Algorithm 1).
It has been demonstrated that the use of Tournament Selection facilitates a more rapid convergence, as described in the framework recommendation [64]. Hence, no alterations were made to this operator.
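The crowded comparison at the heart of binary tournament selection is short enough to sketch directly. In the Python below, the dictionary layout of an individual and the function names are assumptions for illustration; feasibility handling, which a full implementation would typically check before rank, is left out.

```python
import random


def crowded_better(a, b):
    """a and b are (rank, crowding_distance) pairs; True if a wins the comparison."""
    if a[0] != b[0]:
        return a[0] < b[0]   # lower Pareto rank wins
    return a[1] > b[1]       # same front: larger crowding distance wins


def binary_tournament(population, rng):
    """Draw two distinct individuals at random and return the winner."""
    first, second = rng.sample(population, 2)
    a = (first["rank"], first["crowding"])
    b = (second["rank"], second["crowding"])
    return first if crowded_better(a, b) else second


# Hypothetical individuals described only by rank and crowding distance.
population = [
    {"name": "plan_a", "rank": 0, "crowding": 0.4},
    {"name": "plan_b", "rank": 1, "crowding": 2.0},
]
winner = binary_tournament(population, random.Random(0))
print(winner["name"])
```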
  3.6. Crossover and Mutation
Crossover is a process by which genes from selected individuals are recombined to form the next generation, allowing new solutions to be created by merging genetic material from the parent chromosomes. The crossover probability determines the likelihood that the chromosomes undergo this process.
It is standard practice to include mutation as the final operator in an iteration. This operation performs unary transformations (transformations with one operand) on selected individuals. This is achieved by changing alleles in individual chromosomes. In general, the probability of using the mutation operator is fixed throughout all iterations [41]. Typically, all genes are checked and the respective alleles are randomly changed according to a constant low probability [41]. It should be noted that this process has some limitations. Even with the use of library routines for initialising and generating random numbers, the process presents a significant computational challenge.
While Genetic Algorithms can be an effective technique, they also have an experimental quality, as demonstrated by Hassanat et al. [69]. It is not possible to apply a single formula to determine the optimal operator rates.
In general, a low crossover probability is expected to slow down the convergence process in the initial iterations, whereas a probability that is too high may result in stagnation around a solution [41]. For the mutation operator, lower rates are typically used to avoid converting the evolution programme into a random search approach [70].
According to the work of Hassanat et al. [69], several studies opt to use ranges of [0.5–1.0] for the crossover probability and [0.001–0.5] for the mutation probability.
In both the custom NSGA-II and the non-modified NSGA-II, the crossover operation used was the framework default. The SBX (simulated binary crossover) method was used with different crossover probabilities.
In terms of the mutation operator, the preliminary tests showed that while NSGA-II is efficient and yields a wide range of Pareto-optimal solutions, many of these solutions violate the problem logic, for example by selecting harvest ages outside the permitted range or invalid stand allocations. The mutation operator was therefore adapted to improve the probability of maintaining validity when solutions undergo mutation, as Verma et al. [66] suggest.
By integrating problem constraints directly into the mutation process, the operator minimises the generation of invalid solutions. The process probabilistically flips bits for the alleles, while confirming whether the allele randomly chosen to undergo mutation will maintain the chromosome’s validity with regard to the harvest-age intervals or species alternatives.
This approach helps to strike a balance between exploring new configurations and maintaining validity, reducing computational waste and ensuring that the algorithm user only sees the final, valid solutions that optimise both objectives.
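A validity-preserving mutation of this kind can be sketched as follows. This is one possible reading of the description, not the authors' operator: instead of flipping a single bit and then checking validity, the sketch moves the set bit of a constrained group to another permitted position, which keeps the chromosome valid by construction. The `groups` description and all names are hypothetical.

```python
import random


def mutate(chromosome, groups, rate, rng):
    """Return a mutated copy in which every constrained group stays one-hot.

    groups: (start, length, first, last) tuples; each group is a slice of the
    chromosome whose single set bit must stay at an offset in first..last.
    """
    child = list(chromosome)
    for start, length, first, last in groups:
        if rng.random() >= rate:
            continue                      # this group is left unchanged
        current = child[start:start + length].index(1)
        choices = [p for p in range(first, last + 1) if p != current]
        if not choices:
            continue                      # no other valid position exists
        child[start + current] = 0
        child[start + rng.choice(choices)] = 1
    return child


# One stand with two compositions, then one species with five periods,
# of which only offsets 1..3 are valid harvest periods (assumed values).
parent = [1, 0] + [0, 0, 1, 0, 0]
groups = [(0, 2, 0, 1), (2, 5, 1, 3)]
child = mutate(parent, groups, rate=1.0, rng=random.Random(1))
print(parent, child)
```

With `rate=1.0` every group is moved, which is useful for testing; in practice a low rate would be used so that most groups are inherited unchanged.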
  3.7. Stopping Criteria and Performance Metrics
The evaluation of the quality of the solutions allows for the determination of whether the process should be terminated or continued in order to achieve the optimal results.
The preliminary tests of this module demonstrated that, as the number of iterations increased, both the number of solutions identified and the runtime increased exponentially. Given that the algorithm module will be available on a Web platform, the number of generations was identified as the primary criterion for termination of the process.
To assess the impact of these modifications, a comparative analysis was carried out using the same scenario and random initial population, contrasting the performance of the standard mutation operator (as required by the original algorithm package) with that of the custom mutation operator. The intention was to assess whether there are significant advantages in using a customised mutation operator to reduce the number of final invalid solutions.
A comprehensive evaluation methodology was used to compare the two approaches within the case study. The performance indicators employed for this comparison included the number of non-dominated solutions, the spacing metric, computational time, and both the evolution and interval values of hypervolume. These metrics were selected because the true Pareto front was not known a priori [66].
The number of non-dominated solutions quantifies the cardinality of the non-dominated set within the final population, representing the solutions located on the approximated Pareto front [71]. An elevated number of non-dominated solutions is indicative of a greater diversity of trade-offs, which is highly desirable in the context of multi-objective optimisation [27].
The spacing metric assesses the uniformity of the distribution of solutions along the approximated Pareto front. A lower spacing value indicates a more uniform distribution of solutions, which in turn provides better coverage of the Pareto front [
27,
72].
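The exact formulation of the spacing metric is not restated here, so the sketch below assumes the commonly used definition by Schott: the standard deviation of each solution's Manhattan distance to its nearest neighbour in objective space. The function name `spacing` is hypothetical, and the objectives are assumed to have been scaled to comparable ranges beforehand.

```python
import numpy as np

def spacing(F):
    """Schott-style spacing of a front F (rows = solutions, columns = objectives).

    Assumption: objectives are already normalised to comparable scales.
    """
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    if n < 2:
        return 0.0
    # Pairwise Manhattan distances between all solutions.
    D = np.abs(F[:, None, :] - F[None, :, :]).sum(axis=2)
    # Ignore the zero distance of each solution to itself.
    np.fill_diagonal(D, np.inf)
    d = D.min(axis=1)  # nearest-neighbour distance per solution
    return float(np.sqrt(np.sum((d - d.mean()) ** 2) / (n - 1)))

# Evenly spread points give a spacing of zero; a gap raises the value.
even_front = [[0, 3], [1, 2], [2, 1], [3, 0]]
uneven_front = [[0, 3], [1, 2], [3, 0]]
```

A value of zero means that every solution is equally far from its nearest neighbour, which is the uniform distribution that a lower spacing value indicates.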
In terms of computational performance, the runtime was employed as a metric to assess the total time taken by the NSGA-II algorithm to reach termination. Although faster algorithms are usually the preferred option, particularly given the intended deployment of this solution on a Web platform, it is important to balance computational efficiency with the quality of the solutions obtained.
The hypervolume metric was used to evaluate the quality of non-dominated solutions [
73]. This metric, calculated with a reference point greater than the maximum values of the Pareto front, quantifies the volume of the objective space dominated by the set of solutions [
64]. A larger hypervolume indicates a better approximation of the Pareto front [
74]. Additionally, the evolution of hypervolume over successive generations provides insights into the algorithm’s convergence characteristics, with a higher final hypervolume representing a better overall trade-off between the objectives.
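For two objectives, the hypervolume can be computed with a simple sweep over the sorted front. The sketch below is one possible implementation and not necessarily the one used by the algorithm package; it assumes both objectives are minimised and that the reference point is worse than every solution in both objectives. The name `hypervolume_2d` and the sample points are hypothetical.

```python
import numpy as np

def hypervolume_2d(F, ref):
    """Area dominated by the points in F and bounded by the reference point.

    Assumption: two objectives, both minimised.
    """
    F = np.asarray(F, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Only points that are strictly better than the reference point contribute.
    F = F[np.all(F < ref, axis=1)]
    if F.shape[0] == 0:
        return 0.0
    # Sort by the first objective, breaking ties with the second.
    F = F[np.lexsort((F[:, 1], F[:, 0]))]
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in F:
        if f2 < best_f2:
            # Add the new horizontal strip this point contributes.
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
        # Otherwise the point is dominated and adds no area.
    return float(hv)

front = [[1, 3], [2, 2], [3, 1]]
hv_front = hypervolume_2d(front, ref=[4, 4])
```

Recording this value once per generation yields the convergence curve referred to above.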
The findings of this analysis are presented and discussed in the following sections, outlining the potential advantages of using a customised mutation operator for complex, high-dimensional optimisation challenges, such as those encountered in forest management.
  4. Case Study
This study evaluates the performance of the NSGA-II algorithm using a semi-hypothetical dataset for a complex forest scenario. A forest area was selected in the Coimbra District, Portugal (
Figure 3), considering the physical characteristics of the landscape. The area is divided into three types of landscapes: matrix, resilience, and conservation systems. However, only the matrix landscape type was used for the simulation. Each stand within the forest can be categorised as a pure stand (containing only one species) or a mixed stand (containing more than one species).
The forest is divided into 20 stands, 14 of which are part of the landscape matrix and classified as forest stands, making them suitable for the planning problem (
Figure 3). These 14 stands were selected for testing, reflecting the complexity of a typical real-world problem that is often large in scale. Inventory data, typically provided by forest managers, were collected for these stands. These data included essential information required by the algorithm, such as the stand’s total area and the proportion of usable area. For existing stands, the collected data included the composition type (pure or mixed), the species present (up to three species per stand), their respective proportions, and biometric parameters to estimate the yield class and the age class for each species. For new stands, the forest manager was asked to estimate up to three future alternative forest compositions and provide an expected yield class for each species. To allow harvests to be scheduled, forest managers also determined, for all alternative compositions within each stand and for each species, the minimum and maximum harvest rotation ages and whether to employ a shelterwood or a clearcut system. Additionally, they established a planning horizon of 100 years, divided into five-year periods.
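The time discretisation described above can be sketched as follows. The 100-year horizon and five-year period length are taken from the text; the helper names and the rotation ages in the example are hypothetical, and stepping the candidate harvest ages in period-length increments is an assumption rather than a documented property of the algorithm.

```python
HORIZON_YEARS = 100  # planning horizon stated in the text
PERIOD_YEARS = 5     # period length stated in the text

def planning_periods(horizon=HORIZON_YEARS, step=PERIOD_YEARS):
    """Return the (start, end) year of each planning period."""
    return [(start, start + step) for start in range(0, horizon, step)]

def candidate_harvest_ages(min_age, max_age, step=PERIOD_YEARS):
    """List the harvest ages between the minimum and maximum rotation ages.

    Assumption: candidate ages advance in steps of one planning period.
    """
    return list(range(min_age, max_age + 1, step))

periods = planning_periods()
# Hypothetical rotation window of 35 to 50 years for one species.
ages = candidate_harvest_ages(35, 50)
```

With these settings, the horizon comprises 20 periods, and each rotation window translates into a small, finite set of harvest ages for the algorithm to choose from.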
A representation of the context used for the algorithm is provided in 
Table 1. This table outlines the combinations of stands, species, and harvest ages, along with the designated cutting type, as defined for two stands (1 and 9). These examples illustrate two different scenarios: one in which there is an existing forest stand (stand 1) and another in which a new stand will be regenerated (stand 9).
Some stands have a single possible combination, so the only combinatorial change that can occur within them is the harvest age. In stand 9, however, the algorithm compares multiple harvest ages for each combination and identifies the combination of species and harvest ages that best satisfies both objectives. In the final solutions for stand 9, only one of the following combinations is permitted: Stand 9-I, II, or III. This approach results in a forest management problem consisting of 437 genes per chromosome.
  5. Results
All algorithms were implemented and tested on a MacBook Air (Apple M2 CPU, 16 GB RAM) with Apple M2 GPU under the macOS Sonoma (version 14.4.1) operating system. The programming environment used Python 3.11.6, with the NumPy 1.26.3 and SciPy 1.14.0 libraries. These specifications were able to provide sufficient computational power for handling this large-scale task within reasonable time frames.
A comparison of the outcomes of the two approaches was achieved by running the basic NSGA-II and the custom mutation operator NSGA-II algorithms 10 times each for each set of parameters, given that it was not possible to maintain consistency in the initial population across all runs.
Table 2 outlines the parameters that were modified within the package used for each comparative analysis. The crossover probability range selected was [0.7–0.9], as the values between 0.7 and 0.9 did not result in any significant changes in previous test settings. The mutation rate was set between 0.001 and 0.5, as this interval was found to have a significant impact on the number of valid solutions generated. In general, the algorithm favours mutation rates near 0. However, given the mutation constraints incorporated into the custom NSGA-II, it was deemed prudent to also investigate rates around 0.5, as previous tests had shown that highly significant solutions could be produced with such values.
The efficacy of these values has previously been demonstrated, hence their selection for the present tests. The number of generations ranged from 100 to 500. In the initial tests of the algorithm, improvement was only apparent after 1000 iterations because of the difficulty of generating valid solutions. The newly implemented improvements proved effective in addressing this problem. As a result, 1000-iteration runs, which would not be feasible under the imposed runtime constraints, were not required.
The results for population size (
pop), number of generations 
(n_gen), crossover probability, and mutation probability that demonstrated optimal performance are presented in 
Table 3 and 
Table 4.
As illustrated in 
Table 2, a substantial number of configurations were tested, resulting in 72 scenario cases. Therefore, the data presented in the tables mentioned above represent a subset of the complete data. It is important to note that the standard NSGA-II results include a small number of invalid solutions, which are the result of not using the custom mutation operator. Although these instances are not prevalent, they occur.
Despite the variations in the selected case studies, several key conclusions can be drawn from the analysis of the results as a whole, rather than from the small example alone. The customised version of the NSGA-II algorithm consistently demonstrates superior performance compared to the standard version when applied to the specified problem.
The custom mutation NSGA-II demonstrates a clear advantage in identifying a greater number of solutions. To illustrate, in the pop = 100/n_gen = 200 parameter configuration, the custom mutation NSGA-II identifies 55 solutions (crossover = 0.9, mutation = 0.5) in 26.71 s, while the standard NSGA-II identifies only 32 solutions under the same parameter settings.
Similarly, in the pop = 200/n_gen = 500 configuration, the custom mutation NSGA-II produces 123 solutions (crossover = 0.9, mutation = 0.002) in 50 s, compared to 98 solutions from the standard NSGA-II algorithm.
The mutation correction in the custom NSGA-II effectively balances exploration and exploitation through optimised mutation and crossover rates, aiding in escaping local optima and increasing solution density on the Pareto front.
Crossover and mutation rates have a notable effect on performance: a 0.7 crossover rate generally surpasses the 0.9 rate in the custom NSGA-II. In the pop = 100/n_gen = 200 configuration, combining a crossover rate of 0.7 with a mutation rate of 0.5 results in 96 solutions, compared to 55 solutions with a crossover rate of 0.9. Lower mutation rates, such as 0.002, produce more solutions when combined with a smaller crossover rate, because the search then finds more solutions close to the previous ones.
The custom mutation mechanism coupled with a 0.7 crossover rate strikes a superior balance between solution diversity and convergence speed, as indicated by the hypervolume metric. In contrast, the standard NSGA-II compensates for its limited ability to exploit the solution space by relying on higher crossover rates, such as 0.9.
While both algorithms benefit from larger populations and more generations, the extent varies. The custom NSGA-II almost doubles the number of solutions when moving from settings like pop = 50/n_gen = 100 to pop = 100/n_gen = 200. The standard NSGA-II also benefits from scaling, but to a lesser degree than the custom mutation version.
The custom mutation NSGA-II tends to have a longer runtime but produces a greater number of solutions, and the runtime gap is not uniform across configurations. For example, with pop = 200/n_gen = 500, it finds 123 solutions in 50 s, whereas the standard NSGA-II achieves 98 solutions in 53 s. Although the standard version can be marginally faster for smaller setups, it offers less diversity in solutions.
The custom NSGA-II’s prolonged execution time is justified by its capability to produce a more varied and concentrated Pareto front. This feature is especially beneficial in this problem setting, where diversity and validity of solutions are crucial.
Regarding the spacing metric, the custom mutation NSGA-II yields spacing values that are lower than or equal to those of the standard NSGA-II, demonstrating a distribution of solutions that is at least as uniform. For example, under the pop = 200/n_gen = 500 setup, the custom mutation NSGA-II achieves a spacing of 0.02 across several parameter configurations, whereas the standard NSGA-II also reaches 0.02, but produces fewer solutions overall. This indicates that although both algorithms can achieve a uniform distribution, the custom mutation NSGA-II offers greater density and diversity, thus ensuring a more complete Pareto front.
This analysis can be validated by examining the hypervolume in conjunction with the iterations. For instance, consider the parameter configuration that yields the most solutions for both the custom and standard NSGA-II: pop = 200/n_gen = 500, crossover = 0.9, mutation = 0.002. The subsequent plots relate to this configuration; each parameter setup was executed for 10 consecutive runs.
In order to gain an accurate interpretation of the hypervolume metric results, it is crucial to consider not only the minimum and maximum hypervolume indices, but also the manner in which the convergence curve progresses over the course of the iterations. Although the final hypervolume values of the algorithms may appear to be similar, an analysis of the convergence behaviour across iterations provides valuable insight into the quality of the solutions.
An examination of the convergence curve for each parameterisation reveals that the customised algorithm displays a more continuous, gradual increase in hypervolume across the ten runs. This indicates a more consistent enhancement in both the convergence and diversity of solutions compared to the standard algorithm, which tends to demonstrate declines or stagnation in the hypervolume index. These observations are illustrated in 
Figure 4 and 
Figure 5. The smoother curve in the customised algorithm indicates a more stable and gradual improvement, reflecting a higher overall solution quality.
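The qualitative reading of the convergence curves can be complemented by a simple numerical summary. The sketch below assumes that the hypervolume has been recorded once per generation for each run and stored as a two-dimensional array (runs by generations). The function name `convergence_summary` and the sample values are hypothetical and do not reproduce the results shown in the figures.

```python
import numpy as np

def convergence_summary(hv_history):
    """Summarise hypervolume curves stored as an array of shape (runs, generations)."""
    hv = np.asarray(hv_history, dtype=float)
    steps = np.diff(hv, axis=1)  # change in hypervolume between generations
    return {
        "mean_curve": hv.mean(axis=0),
        # Generations in which the hypervolume dropped, per run.
        "declines_per_run": (steps < 0).sum(axis=1),
        # Generations in which the hypervolume did not change, per run.
        "stagnant_per_run": (steps == 0).sum(axis=1),
        "final_min": float(hv[:, -1].min()),
        "final_max": float(hv[:, -1].max()),
    }

# Hypothetical curves: one steadily improving run and one with a decline.
history = [
    [0.1, 0.2, 0.3, 0.4],
    [0.1, 0.3, 0.2, 0.2],
]
summary = convergence_summary(history)
```

A run with no declines and few stagnant generations corresponds to the smooth, gradual curve described above, whereas non-zero decline counts flag the irregular behaviour.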
Figure 6a,b illustrate the Pareto fronts derived from the specified case study, showing the optimal balance between computational expense and the number of solutions identified. As illustrated in 
Figure 6a, the solution space is limited, with timber volumes ranging from approximately 40,000 to 65,000 and a standard deviation of approximately 12,000 to 20,000. The distribution of points is more sparse, indicating that a smaller number of solutions were identified. Furthermore, the correlation between volume and standard deviation appears to be relatively linear, with the standard deviation gradually increasing as the volume increases. The Pareto front in this case explores a narrower range of higher volumes and higher standard deviations, indicating a lack of diversity in the solutions.
 In contrast, 
Figure 6b illustrates a more exhaustive exploration of the solution space. The points in this graph are more densely packed, indicating that a larger set of potential solutions is available. In particular, the customised algorithm appears to provide more trade-offs for lower volumes, as it explores regions with reduced standard deviations.
The goals of maximising timber volume and minimising standard deviation conflict with each other. Consequently, every solution on the Pareto front is equally valid in multi-objective optimisation. In the absence of specific preference information, selecting a particular solution depends entirely on the user’s priorities. Each Pareto front point represents a forest management strategy produced by the algorithm, as depicted in 
Table 5.
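One simple way in which a user could express such priorities is a weighted score over the normalised objectives. This is only an illustrative sketch and is not part of the method described in this work: the function name `pick_by_preference`, the weight parameter, and the sample values are all hypothetical.

```python
import numpy as np

def pick_by_preference(volume, std_dev, w_volume=0.5):
    """Return the index of the Pareto solution with the best weighted score.

    w_volume is the weight given to timber volume (to be maximised);
    the remaining weight goes to the standard deviation (to be minimised).
    """
    volume = np.asarray(volume, dtype=float)
    std_dev = np.asarray(std_dev, dtype=float)

    def scale(x):
        # Min-max scaling so that both objectives lie in [0, 1].
        span = x.max() - x.min()
        return np.zeros_like(x) if span == 0 else (x - x.min()) / span

    score = w_volume * scale(volume) - (1.0 - w_volume) * scale(std_dev)
    return int(np.argmax(score))

# Hypothetical Pareto points, not taken from Table 5.
volumes = [40000, 50000, 65000]
std_devs = [12000, 14000, 20000]
balanced_choice = pick_by_preference(volumes, std_devs, w_volume=0.5)
```

Setting `w_volume` close to 1 favours high-volume strategies, while values close to 0 favour strategies with a more even yield across periods.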