Evolutionary 3D Image Segmentation of Curve Epithelial Tissues of Drosophila melanogaster

: Analysing biological images coming from the microscope is challenging; not only is it complex to acquire the images, but also the three-dimensional shapes found on them. Thus, using automatic approaches that could learn and embrace that variance would be highly interesting for the ﬁeld. Here, we use an evolutionary algorithm to obtain the 3D cell shape of curve epithelial tissues. Our approach is based on the application of a 3D segmentation algorithm called LimeSeg, which is a segmentation software that uses a particle-based active contour method. This program needs the ﬁne-tuning of some hyperparameters that could present a long number of combinations, with the selection of the best parametrisation being highly time-consuming. Our evolutionary algorithm automatically selects the best possible parametrisation with which it can perform an accurate and non-supervised segmentation of 3D curved epithelial tissues. This way, we combine the segmentation potential of LimeSeg and optimise the parameters selection by adding automatisation. This methodology has been applied to three datasets of confocal images from Drosophila melanogaster , where a good convergence has been observed in the evaluation of the solutions. Our experimental results conﬁrm the proper performing of the algorithm, whose segmented images have been compared to those manually obtained for the same tissues.


Context
Nowadays, image processing is used in many different fields such as medical images, object detection and face recognition. This discipline focuses on the analysis and treatment of images to improve their quality and extract information from them [1]. An essential technique of image analysis is segmentation, which plays an important role on many applications from picture editors to image-based diagnostic systems.
Segmenting an image consists in dividing it into regions or objects so that it can be analysed afterwards [2]. Thus, pixels are categorised in different connecting regions depending on their grey scale. From a classic point of view, the grey scale of pixels can be categorised for edge detection or object detection [3]. The methods to detect edges use properties of the grey scale of pixels related with differences colour distinctness and texture variation [3]. These characteristics describe objects which are in boundaries of regions.
To discover other objects, methods based on the similarity of the grey scale are used, which are classified in three groups: region-based segmentation, watershed segmentation and threshold segmentation [4]. The first group consists in grouping the pixels of an image into homogeneous regions using a criterion [4]. Usually, properties such as colour, texture and intensity establish the characteristics of a pixel to be included in a region. The simplest form of region-based segmentation is region growing. In this variant, the pixels are clustered into superior regions using an existing criterion: a set of points called seeds. These points determine the characteristics of a pixel to be included in a region. For instance, an implementation of this variant is LimeSeg [5]. Alternatively, another form of segmentation based on similarity grey scale is the watershed algorithm. In this case, the image is conceived as a topographic relief where the grey scale of pixels represents the heights of the plane, which is useful to define the contour of the background or to separate objects [6].
Finally, there is another group based on grey scale similarity, called threshold segmentation. This consists in splitting the pixels of an image depending on one or more threshold values based on the histogram of their grey scale [7]. For instance, a single threshold uses only one threshold to distinguish the background of an image from their objects. If it is necessary to segment the objects in different groups, multiple threshold segmentation is applied, but it is a costly process [8].
All these methods have been applied to analyse many different types of images such as numbers, faces or objects with correct results. However, segmentation is still a challenge in biological images, specifically, their difficulties lie in their acquisition conditions, such as noise from the microscopes, low resolution and low frame rate as well as the characteristics of the objects, for example, brightness, unclear boundaries and overlapping objects [9]. An efficient segmentation method of biological images will have to overcome all these problems considering that a high level of quality and accuracy is necessary for all of them [10].
Efficient segmentation methods of biological images work well on one data set of biological images, but if they are applied to data sets with cells from different organs and tissues, their performance can be significantly reduced [11]. Therefore, they need to manually readjust their parameters for new images to obtain an accurate segmentation [12], requiring constant supervision which delays research projects. Thus, each time a different biological image is segmented, an optimisation problem appears around these values, which are dependent on the segmentation method [13,14]. Owing to this problem, machine learning approaches have been developed to generalise and automatise the process of segmentation [15].

Related Works
Machine learning is defined as the development of intelligent systems that learn to solve complex problems automatically from experience [16]. Their approaches for segmentation classify in two groups: unsupervised and supervised.
The first approach is based on finding a reference criterion that can be used to optimise the parameters of a segmentation algorithm [17]. For instance, Chakraborty et al. [18] applied a modified version of the search algorithm of the cuckoo for segmenting bidimensional microscopic images of the hippocampus in an unsupervised way. To do that, they optimised segmentation methods based on thresholding, even though they found unsatisfactory results for low levels of thresholds. In addition, Ayas et al. [19] applied the search algorithm of the firefly based on swarm intelligence to segment biological images in two dimensions. To achieve that, they also optimised a threshold segmentation. Furthermore, Zhao et al. [20] designed a novel version of mean-shift clustering algorithm to segment bidimensional images in an unsupervised way. The algorithm proposed a new online seed optimisation policy and early stopping as conditions to achieve a better computational speed.
By contrast with the previous examples, supervised approaches use a training data to learn how to segment the images [17]. The disadvantage of supervised approaches is the creation of training data, as it requires consistency among the researchers dedicated to sample and it requires a time-consuming methodology. For instance, Held et al. [21] applied different optimisation algorithms, such as genetic algorithms, hill climbing and coordinate descent to segment images in two dimensions in a supervised way, using the ground data as fitness function. In addition, Al-Kofahi et al. [22] designed a combination of a deep learning algorithm and segmentation algorithms to automatically segment biological images in 2D. The deep learning model learned to predict the locations of cells, which were used to find a threshold to initiate the seed of the watershed algorithm.
Most of the approaches to automatically segment biological images are supervised [23][24][25][26][27][28]. Frequently their applications have been limited to segmentations in two dimensions due to the fact that three-dimensional data sets are difficult to obtain in some biological fields, where the availability of raw images is limited or where the segmentation is particularly difficult [29,30]. However, the necessity of studying biological tissues in a context of three-dimensionality has been proved, for example, to unveil the presence of a novel cellular shape necessary to pack efficiently monolayer epithelial tissues in curvature conditions: the scutoids [31], or to analyse the composition of other complex tissues, such as multilayer epithelia, where tetradecahedral cells predominate [32]. Consequently, some three-dimensional biological studies cannot progress as fast as they expect, and they still need to optimise the segmentation process manually or through classical approaches, which may cause significant delays. Thus, it is vital to optimise this process automatically, to achieve automatic segmentation, which would accelerate their science.
This article presents an implementation of an evolutionary algorithm proposed to optimise three-dimensional segmentation in a semi-supervised way. Our implementation uses a three-dimensional region growing segmentation algorithm called LimeSeg, whose parameters are optimised through the measurement of segmented cells. It has been applied to three different data sets of confocal images (each data set contains several images to represent only one sample of a tissue) taken from the model organism, Drosophila melanogaster: salivary glands, embryos and egg chambers. Classically, all of them have been widely used to analyse the different biological events in which epithelial cells are involved: division, migration, apoptosis, differentiation, polarity, cancer, wound healing and much more [33][34][35][36][37]. We consider that our present work has these novelties: • A semi-supervised approach of machine learning to segment images in three dimensions based on evolutionary algorithms. • Segmented cells and their volumes can be a criterion to segment biological images using region growing in a non-supervised way. • Our tool does not require supervision of the user after it has been initiated.
The remainder of this paper is organised as follows. Section 2 summarises the image acquisition, preprocessing and segmentation processes. It also describes our evolutionary approach for automatic segmentation, providing details of all its features, so that it can be fully reproducible by third parties. Next, Section 3 presents and analyses the obtained experimental results for the aforementioned tissues of Drosophila melanogaster, including a discussion on the obtained results, considering the current context and our starting hypothesis. This section also briefly describes our future research directions, in relation to this work. Finally, the main conclusions are summarised in the last section.

Materials and Methods
The images used in this study were obtained by confocal microscopy. This technique is vital because it allows us to reconstruct images of two dimensions in three dimensions. Figure 1 presents the confocal images of each of the three tissues under study. These images which have been manually preprocessed by a pipeline of those programs: Photoshop (https://www.adobe.com/es/products/photoshop.html (accessed on 10 April 2021), Fiji (https://imagej.net/Fiji (accessed on 9 April 2021)) and MATLAB (https://es. mathworks.com/products/matlab.html (accessed on 11 April 2021)). The objective of this preprocessing phase was to prepare the images for further segmentations.
In this context, Photoshop has been used to delimit regions of the images, highlighting their white edges and black interiors. Fiji, on the other hand, was used to correct their brightness and contrast. In some cases, MATLAB was used to fix artefacts and to delimit regions more accurately.
In order to segment these images, we have used a segmentation algorithm based on region growing called LimeSeg (https://github.com/NicoKiaru/LimeSeg (accessed on 15 April 2021)), which uses a technology called "SURFEL" (SURface ELements) that considers cells as surfaces composed of particles to be delimited. During segmentation, the surfaces are attracted to two maximums: one local maximum which is related to the object, and another global maximum of the image. The first maximum forms the surfaces, while the second one causes all the surfaces to be cohesive with each other [5].
To perform a segmentation, LimeSeg needs the seeds of the cells, which can be placed using Fiji, as well as a set of four configuration parameters: • D_0: Number of pixels to be considered as one feature. Its value is measured in pixels and varies between 1 and 20. The lower this parameter, the smaller objects are detected.
• F_pressure: Gradient of pressure that is applied to all surfaces causing their expansion or contraction. Its value varies from −0.03 to 0.03. • Z_scale: Fixed value that can be calculated as physical ratio between voxel width and voxel depth. These characteristics appear in the confocal images. For instance, if the voxel with is 260 and the voxel depth is 100, the Z_scale value is 2.6. • Range D_0: Space to look for a feature whose intensity level is maximum. Its value varies from 0.5 to 10 and also depends on the D_0 parameter. For example, if the number of pixels of a feature is 2 and its value is 2, the particles will seek for a maximum in the range [−4, +4] pixels from the surface. Once the seeds have been manually placed and the parameters have been set, LimeSeg grows the surfaces in three dimensions until all of them converge. At this moment, the segmentation of each cell can be saved in a format called Polygon File Format (PLY), which describes them as a set of points in three axes and their angles. With these properties, it is possible to compute the volume or the gravity centre of a graphic object. Additionally, other characteristics such as colours or directions are included.
This segmentation method must be carried out many times by an expert until the best segmentation possible is found. The expert must decide and vary LimeSeg configuration parameters each time a segmentation is produced, taking this whole process a great amount of time.

Evolutionary Algorithm
In order to optimise the previous segmentation procedure, we have designed and developed an evolutionary algorithm (https://github.com/kapy95/3DAutomaticSegmentationTFM (accessed on 16 April 2021)), which will iteratively try to find the best parameter configuration possible for LimeSeg. This strategy of optimisation is based on Darwin's idea of the survival of the fittest, which establishes that in an environment with a population of individuals whose resources are limited, the fitter an individual is, the more chances it has to pass its genes to the next generation.
This same idea has been adapted in evolutionary algorithms [38], where different individuals represents a set of possible solutions to a problem, called a population. Each solution is generated by specific values in relation to the problem domain, which are referred to as genes. A selection process is carried out in this population based on a score established by a fitness function, which maximises or minimises a value generated from the genes of an individual. The selected individuals are used to generate new potential solutions, and the process is repeated until a stopping criterion is satisfied, such as a limited number of iterations.
Evolutionary algorithms (EA) have been previously successfully applied for twodimensional images segmentation (see Section 1). This fact, together with the ability of EAs to explore large search spaces in optimisation problems has motivated us to adapt this heuristic. The following subsections detail how the common characteristics of EAs have been implemented in this study.

Individual Encoding and Initialisation
Encoding refers to the individuals inner representation in the algorithm, where each individual describes a possible solution to the problem by its genotype (also called genome or chromosome). In order to represent image segmentations, LimeSeg parameters have been used within the genotype (see Section 2). As the Z_scale value is fixed at the beginning and remains constant for each tissue, each individual genotype is therefore made up of three real values: D_0, Range_D0 and F_pressure. From these three genes values, it is possible to obtain the phenotype of each individual, consisting in a specific segmentation for the input image.
The genotype of the individuals in the initial population are generated following an initial population procedure. Depending on the adopted strategy in this step, the algorithm may converge to different solutions. Furthermore, a suitable initial population strategy can even speed up the convergence [39].
In our context, the first generation of the algorithm has been created randomly, where each of the three genes of each individual has been randomly generated within its range (see Section 2). Furthermore, in order to have more promising individuals, an initial population of a much greater size is computed, selecting afterwards the best individuals, according to the population size, and after performing their evaluations. This initial population size has been set to 200 by default, although it can be input to the algorithm by an user defined parameter.
After the generation of the individuals, and prior to their evaluation, it is necessary to obtain their phenotypes. This implies the execution of LimeSeg per individual, which takes approximately a minute to perform the segmentation. Thus, a generation of one-hundred individuals would require one hour and a half, and it would take almost two weeks to evolve two-hundred generations.
In order to overcome this situation, the segmentation time of an individual has been delimited through a maximum time depending on its generation, as the genes of the individuals in the first generations might contain combinations of parameters leading to poor segmentations. The minimum value for the segmentation process has been set to 15 s and it has been extended through generations according to the following arithmetic series: In the previous formula, n represents the generation number, and 0.8 is the constant which increases the segmentation time. The minimum value is set to 15 s, as the first generation is generated randomly, which causes that many individuals are not correct. Alternatively, the difference has been set to 0.8 s per generation to avoid spending too much time in suboptimal solutions and reach the normal time of a representative segmentation (approximately more than a minute) in higher numbers of generations. Thus, the algorithm approaches the optimal solution step by step, ensuring that more promising solutions will have greater segmentation times.

Fitness Function
Individual evaluation is carried out through the fitness function in order to quantify how close an individual is to the optimal solution. This evaluation will be used to select the individuals within the generational change process, guiding thus the algorithm towards optimal or near-optimal solutions.
The fitness function has been designed to assign each individual a score based on its segmentation results by LimeSeg. From the output PLY files, it is possible to obtain information on the average cell volumes, as well as the number of the segmented cells. From this information, the formula for computing this score is given in Equation (2): where AV = Average volume of the segmentation and NC = Number of segmented cells. The terms in the previous formula have been defined to select the best individuals. First, this is achieved by maximising the volume, as sub-segmented solutions tend to have cells whose surface is less than the real one and, thus, less volume. Second, this is also achieved by counting the number of segmented cells and calculating the average volume of the segmentation, as oversegmented solutions tend to have few cells whose volumes are abnormally big. When this situation occurs, these cells occupy the surfaces of other cells, forcing the segmentations of the other cells to disappear.
Additionally, the average volumes have been standardised as their orders were too high and the comparison among individuals was not fair. This process of normalisation depends on the whole generation and consists in dividing the average volume of an individual by the sum of all average volumes of all the individuals in the same generation.
As a consequence, the fitness of an individual depends not only on its own genotype, but also on the overall volumes of segmentation in its generation.

Generational Change
In our approach, the first step to generate new populations is applying elitism, passing a limited number of the best individuals to the next population directly. Elitism is usually adopted in order to ensure the convergence of the algorithm [40].
After passing the best two individuals into the next population (elitism of size 2), two different classical selection strategies have been used to select the parent(s) of the rest of the individuals: roulette and tournament [41]. A probability of 50% has been set for each strategy, where in the roulette mechanism, each individual has a probability of being selected proportional to its fitness. In the cases of tournament, a tournament of size 2 has been used, with an 85% probability of choosing the best individual.
A predefined percentage of the remaining of the individuals (set by default to 25%) is generated by the mutation of a previously selected individual. The rest of the individuals to complete the population size (by default a 75%) are generated by the crossover of two selected chromosomes.
When an individual mutates to create an offspring, a new D_0 gene is always generated, but the genes of Range D_0 and F_pressure are only mutated under a certain probability (by default 30%). Their new values are generated randomly within intervals depending on the previous value of each gene: These values have been established taking into account that the new individuals must explore new values but they also have to be compatible between them. On the other hand, when an individual is generated by crossover, one of these three different possibilities is chosen under the same probability (33%): • Single or double point crossover. One-point or two-points crossover is applied with equal probability. D_0 or F_pressure gene is interchanged between individuals in the first case, while for the two-points crossover both Range D_0 and F_pressure genes are interchanged. • Single or double point crossover with mutation. Similar to the previous one, but both parents are mutated before being recombined. These mutations are carried out as aforementioned. • BLX − α crossover. Consists of generating a new individual by creating ranges of gene values from both parents [42]. To generate the intervals, two constants must be defined α (value in [0, 1]) and I (maximum minus minimum of the genes from both parents). These constants create the following intervals: In our work, α has been set to 0.1 in order not to generate too low or high boundaries, due to the high sensitivity of the parameters in LimeSeg.
After a predefined number of iterations the algorithm stops, returning thus the best individual in the final population as the best possible solution under the experimental configuration. By default, the number of generations has been set to 100. Algorithm 1 shows the pseudocode of the evolutionary algorithm. It starts with the initialisation of the population in line 1, followed by an iterative process for the search of the best possible segmentation. The whole process consist in repeatedly replace the current population with an evolved one, until certain criterion is met (lines 2 to 17). Each of this replacements is called a generational change, and previously explained. Lines 18 to 20 represent the finalisation process, where the image corresponding to the best segmentation parametrisation is obtained and returned. The next section presents the experimental results obtained by our algorithm under a certain configuration, for three different tissues.

Results and Discussion
This section presents both the experimental setup and the obtained results on images of three different Drosophila melanogaster (fruit fly) tissues. Table 1 shows the configuration of the main parameters, which have been used in the experiments whose results are presented here. They have been established experimentally but taking into account hardware limitations, as well as time restrictions. For each parameter, a study of several ranges of values has been performed, showing the best test value for each of them in Table 1. All of them have been previously mentioned in Section 2.1.3. For each tissue, we have run our algorithm using the configuration in Table 1. Afterwards, a validation process has been carried out in order to confirm the validity of our approach. In this process, the segmentation produced by our algorithm have been compared to the segmentation carried out manually in the laboratory, obtained using Lime-Seg. This validation has been carried out using the Matlab mathematical software (https: //github.com/kapy95/Processing3DSegmentation/tree/TFM (accessed on 18 April 2021)), and by reconstructing both segmentations as complete images using code from the research group Complex Organization of Living Matter Laboratory at the Cell Biology Department of Seville University (https://github.com/ComplexOrganizationOfLivingMatter/ Processing3DSegmentation (accessed on 18 April 2021)).
The Jaccard index has been used to qualitatively compare the segmentations. It measures the similarity between two sets following the next formula [43]: The process of computing Jaccard index between two images is very expensive as it consists of comparing many matrices. Having 100 generations and 50 individuals per experiment, the whole process of comparing 5000 segmentations would take several days. Nevertheless, these computations have been carried out only for validation purposes, not supposing an extra increase in the execution time of the search algorithm.
In the following, we summarise the results on Salivary glands, Embryo and Egg chamber in separate sections, depicting for each tissue three different graphical representations: The computation time of the final conducted experiments varies from approximately 71 h (for Embryo tissue) to approximately 80 h (for Egg chamber tissue). Considering that it would take almost a whole week for a manual segmentation carried out by an expert, we consider our approach to constitute a significant advance in the current 3D segmentation procedure. Furthermore, the 80 h of algorithm run time is unsupervised, meaning that the expert would be making use of this period to perform other tasks.

Salivary Gland
When running our algorithm with 100 generations (50 individual each) for the images of the salivary gland, the last best individual had a Jaccard index value of 0.93, in relation to the manually segmented image, which also received a manual postprocessing phase to fix some errors. This result represents a 93% accuracy over the original image. Table 2 shows the best individual genotype, together with the parametrisation of the manual segmentation. As it can be seen, both D_0 and F_pressure parameters present similar values for both types of segmentations. As Range D_0 determines the search space for particles, higher values will have negative effects on the segmentation computational cost. In our approach, no actions have been taken to penalize this effect, while in a manual segmentation, it is habitual to use lower values for this parameter. As it can be derived from Figure 2A,B, there exists a progressive improvement in both the average of the Jaccard index and the Jaccard index for the best individual through generations. In both figures, a significant increase in the Jaccard index is produced in the first generations, meaning that the algorithm is capable of rapidly evolve towards acceptable solutions. As Figure 2A depicts the mean of the Jaccard indexes in each generation, the improvement appears to be much greater during first generations, while the Jaccard index for the best individual ( Figure 2B) in the first generations are in a closer range. This is an expected result because the overall quality of solutions in the population improves through generations, with this improvement being much more significant in the first ones. On the other hand, the best individuals in the first generations appears to be far above the mean of the population (0.61 in the three first generations for the best individuals versus 0.09, 0.35, and 0.42 for the population average).
In Figure 2A,B, a constant evolution is observed from generations thirty-seven to eighty-one. This situation corresponds to a local optimum, from which it has been able to emerge after generation eighty-one, and then it continuous improving the overall solutions until the stopping criteria has been met. Therefore, increasing the number of generations, and, consequently, the maximum segmentation time, seem to improve the performance of the algorithm.
As Figure 2B represents individual Jaccard indexes, it can be better appreciated than other situations in which the algorithm has been trapped in local optima, but of shorter length than the one previously commented. It can also be seen a loss of quality between solutions of consecutive generations. This is due to two different facts: on the one hand, the Jaccard index has no direct relation with the individual fitness value. On the other hand, although elitism has been applied, the fitness value is computed for each individual tacking into account not only its genotype, but the genotypes of all the individuals in its generation. This may produces situations in which the same individual in two different generations may have different fitness values.
From the best individual genotype in Table 2, the graphical representation of its segmentation has been depicted in Figure 3B, where it has been compared with the manually obtained segmentation ( Figure 3A). As it can be seen in this figure, the obtained segmented image is almost identical to the manual segmentation with the principal difference that the boundaries of some cells are not the same, but the compactness of both solutions is similar. This may be caused by the difference in the F_pressure parameter which regulates the expansion of the cell boundaries. In view of the presented results, and together with the approval of the experts in manual segmentation in the laboratory, the obtained segmentation of this type of tissue can be considered satisfactory in order to carry out further biological analysis.

Embryo
When running our algorithm with 100 generations (50 individual each) using the images of embryo, the last best individual had a Jaccard index value of 0.74 in relation to the manually segmented image, which also received a manual postprocessing phase to fix some errors. Although this result may seem quite poor, in relation to the previous tissue, experts have confirmed that this is an image particularly difficult to segment. Table 3 shows the best individual genotype, together with the parametrisation of the manual segmentation. Range D_0 parameter is set to 2.0 in the manual segmentation for all experiments as the default value. Nevertheless, our algorithm always seems to obtain higher values for this parameter, allowing thus to explore a larger space of values. Regarding D_0 and F_pressure parameters, they correlate to find good solutions. In the manual approach, a lower value of D_0 together with a higher value of F_pressure have been used, in order to find solutions locally but with higher pressure. On the other hand, our automatic approach obtained values to find solutions globally and with less pressure. Both approaches are complementary but equally valid. Table 3. LimeSeg parametrisation for both automatic and manual segmentations of embryo. Automatic parameters correspond to those obtained by the best individual of the algorithm after 100 generations of 50 individuals each.

Gene/Parameter
Automatic Parametrization Manual Parametrization  Similarly to the previous experiment on salivary gland, it can be observed in Figure 4A that there exists general upward trend in both charts, being more pronounced in the first generations. The algorithm appears to be trapped in a local optimum from generation forty-five to generation eighty-eight. This situation is specially evident in Figure 4B, where, after evolving from a best Jaccard index of 0.67 in generation forty-five to a Jaccard index of 0.74 in generation ninety-six, it appears to reach another local optimum stage, until it reaches the stopping criteria. Presumably, if the number of generations in the stopping criteria were increased sufficiently, the algorithm would be able to find better solutions. As it would be at the cost of increasing the execution time, it becomes a problem to find a compromise between the quality of the obtained solution and the amount of time we are willing to wait to get it.
During the first generations, it can also be observed a difference between the mean and best Jaccard indexes in Figure 4A,B, respectively. While the best solution in the initial population has a Jaccard index of 0.26, the mean of the Jaccard indexes on the same population is 0.08. This difference diminishes as the generations evolve, being the best individual Jaccard index close to the average in generation 18.
From the best individual genotype in Table 3, the graphical representation of its segmentation has been depicted in Figure 5B, where it has been compared with the manually obtained segmentation ( Figure 5A). In this comparison, differences are much more evident than the differences presented by the segmentation of the salivary gland in Figure 3. A lack of compactness is especially observed in the segmentation of the algorithm. This is possibly due to the either the parameter F_pressure , since the volumes of the cells are similar, but they do not seem to have been sufficiently expanded. Furthermore, the height of some cells is not the same in the best individual segmentation and in the manual segmentation, and some cells are missing in the best individual segmentation. Both problems may be related to the difference in the D_0 parameter, which could cause that some pixels are not clustered correctly.
Although the obtained segmentation may not be useful as it is, nevertheless it can be used as a starting point for a further manual postprocessing in order to increase the quality of the segmentation. Therefore, for those images that are more difficult to segment, our algorithm is able to produce a segmentation of sufficient quality that can be further improved manually, thus saving a great deal of time for the laboratory expert.

Egg Chamber
After running our algorithm with 100 generations (50 individual each) for the images of the egg chamber, the last best individual had a Jaccard index value of 0.86, in relation to the manually segmented image, which also received a manual postprocessing phase to fix some errors. This represents an intermediate value between salivary gland and embryo, under the same experimental configuration. Table 4 shows the best individual genotype, together with the parametrisation of the manual segmentation. As well as for embryo, our approach obtains a higher value for D_0 and a lower value for F_pressure than the manual segmentation. This way, the search is performed more globally and with less pressure, although the difference is smaller in this case. The search space, defined by Range D_0 is also wider than in the manual segmentation, also not as extensive as in the two previous experiments, as both D_0 and Range D_0 have lower values. Table 4. LimeSeg parametrisation for both automatic and manual segmentations of egg chamber. Automatic parameters correspond to those obtained by the best individual of the algorithm after 100 generations of 50 individuals each. Similarly to previous cases, Figure 6 shows the evolution of the quality of solutions through generations, both the mean of the Jaccard indexes of all individuals in each generation ( Figure 6A) and the Jaccard indexes of the best individual per generation ( Figure 6B).

Gene/Parameter
In this figure, we can observe a clear improvement in the quality of the solutions by generation. In general, the trend of these graphs is better than the ones for the embryo images, under the same experiment. Although the best value is under the one obtained of the salivary gland images, in this case it can be seen that the algorithm has not been trapped in a local optimum for as many generations. More specifically, in Figure 6B it can be appreciated two different local optimum stages, the first one from generations twentyeight to thirty-nine, and with a best Jaccard index of 0.72. The next stage corresponds to generations forty-two to forty-six, having a best Jaccard index of 0.76. Finally, the best individual during the last five generations also present the same index, being equal to 0.86, the final reported value. Initial population presents a better Jaccard index average for this tissue, as well as a more rapid convergence to the best Jaccard index per generation. Starting from a 0.13 average value in the initial population, it rapidly evolves to a 0.40 average value in the second generation, as it can be seen in Figure 6A. This means that for egg chamber tissue the algorithm is able to evolve to generations with better solutions overall.
From the best individual genotype in Table 4, the graphical representation of its segmentation has been depicted in Figure 7B, where it has been compared with the manually obtained segmentation ( Figure 7A). In this comparison, it can be seen that the resemblance is very similar and reliable, but some cell boundaries are not correct and some cells are slightly over segmented. The differences in the cell boundaries are not so easy to notice as the ones in the salivary gland. In the same way as the previous comparisons, the differences of the cell boundaries may be caused because of the difference in the F_pressure parameter, and the oversegmentations may be caused due to the differences in the D_0 parameter, as some pixels are clustered incorrectly inside some cells. Nevertheless, in general terms the segmentation is correct, representing thus an acceptable segmented 3D image for egg chamber tissue. When comparing the obtained segmentation with others obtained under different experimental parametrisation for the same tissue, it is possible to observe certain cells which were not present in the others. This confirms both the validity of the obtained segmentation and the experimental configuration of the algorithm.

Discussion
Image segmentation is an essential task in biological research projects, as it allows biologists to measure properties of the cells, such as their organisation or packaging. Specifically, it is crucial to segment images in three dimensions, in order to continue describing complex cellular processes. The process of segmenting these images can be costly when supervised algorithms lack of training data, meaning that it can require an expensive manual optimisation of parameters that can take days or even weeks when carried out for three-dimensional images. We consider our work to be pioneer in this field since biological image segmentation has been automatically optimise mostly for 2D images. This section has presented the experimental results for an automatic optimisation of LimeSeg 3D image segmentation process for three different tissues of Drosophila melanogaster.
LimeSeg is an algorithm to recreate cells in three dimensions after the user has introduced a set of parameters, which consists in a set of points points called seeds, and constants of forces to be applied in the seeds. The methodology used in this work automatically optimises a large part of the process of the user. Specifically, we have optimised almost all parameters of LimeSeg through the application of an evolutionary algorithm, which only requires the position of the cells and the images to be segmented. This algorithm uses cellular properties to perform the optimisation.
As a result, we have obtained three-dimensional segmentations of different images of Drosophila melanogaster, where a good convergence has been observed in the evaluation of the solutions, for all the tissues. Best solutions have a good quality in general when compared with the manual segmentations, where better results have been obtained with greater numbers of generations. These facts seem to indicate that the application of this algorithm is satisfactory for the optimisation and automation of three-dimensional images segmentation.
The automation of the segmentation process carried out by our approach is not complete as a set of manually determined seeds representing the nucleus of the cells is needed as a starting point. However, the time cost of this seeds selection process is negligible compared to the selection of the segmentation parameters carried out by the evolutionary algorithm. Without the automation, this process must to be achieved after a trial/error-based manual optimisation.
On the other hand, the variability in the quality of the solutions for the different tissues indicates that the fitness function does not adjust in the same way to all of them. Still, the evolution of the best individuals keeps increasing over time, despite fluctuations in the average Jaccard index. This may be caused by the different quality of the images and their membrane staining. For example, in the images of embryo, the inner membrane is not stained. Therefore, the seeds tend to overgrow, and therefore they occupy other cell surfaces, giving a poor Jaccard index in the embryo. To increase the quality of some solutions, some of the configuration parameters in the algorithm may be varied, such as the number of generations (see Table 1). However, the increase of this parameter entails a significant increase in computational time. Additionally, an improvement in the fitness function definition would have the most significant impact on the quality of solutions, as it is detailed in the future work section.
The quality of our solutions is lower than the quality of supervised segmentations, but we do not need training data, which is the reason for that difference. Furthermore, in comparison with classic approaches, our method generates good segmentations for three different data sets, which is very difficult to obtain for classic approaches without modifications.
Taking everything into account, we consider our work to be a significant contribution to the automation of 3D biological images segmentation, being able to obtain acceptable solutions in a considerable time. Although the obtained segmentations cannot be considered as final for all tissues, they certainly constitute a first approximation than can be further manually improved.
Another conclusion derived from this work that is worth highlighting is the importance of cell volume in three-dimensional segmentation, being the property that most positively contributes to the fitness evaluation. Therefore, this characteristic will centre our efforts in the future work.

Future Work
This section presents our current and future lines of work, mainly dedicated to optimising both time performance and the quality of the obtained results, trying to find a comprise among them. They include, but are not limited to, the following.

•
Initialisation process: As aforementioned, a set of manually determined seeds is needed before the creation of the first generation of solutions. In this sense, we plan to also automate the location of these seeds, so that this process is not dependent on expert supervision. To do this, it would be necessary to add a preprocessing phase that detects the positions of cells through image processing. Those positions would be passed to the evolutionary algorithm to generate the seeds, where their sizes would be optimised. • Fitness function: In order to improve the individuals assessment within the fitness function, it would be necessary to add more information to this function. The use of cells volumes to avoid oversegmentations may oversimplify the information. To avoid this situation, we intend to focus on each cell to be segmented, as in the application of the fruit fly algorithm in [19]. An interesting approach to explore would be estimating the volume of each cell through image processing, which could be used to more accurately measure cells volumes. Furthermore, due to the current costly individual evaluation, we also plan to study and incorporate some additional procedures to speed up this process. In particular, we plan to analyse Long-Term Memory Assistance [44] and surrogate models in evolutionary single-objective optimisation [45]. • Individual segmentation time: Our algorithm uses a maximum segmentation time per individual which might not be enough, explaining thus the improvement in the quality of the solutions with increasing generations. This time limitation was introduced to control the whole execution time to some extent. Nevertheless, we expect that a revision of this parameter could increase the convergence of the algorithm, trying to adjust it to each individual inner characteristics. • Further optimisations and integration: In order to speed up the whole process, we plan to adapt our algorithm to be executed in parallel, and also in graphics processing units (GPUs). Additionally, even though our approach is freely available upon request, we intend to integrate it into a cloud computing service, such as Google Colaboratory. This way, the final user would not need to install anything on their computer, and the computational time would not be dependent on the user's specific hardware. • Study of other tissues of confocal images: We plan to enhance our work by carrying out experiments for the automatic segmentation of other tissues such as the tissues in the training data sets (https://github.com/NicoKiaru/TestImages (accessed on 25 April 2021)) of LimeSeg [5] and the tissues in the training data sets (https://osf.io/uzq3w/ (accessed on 25 April 2021) of PlanSeg [46].

Conclusions
In this paper, we have presented a new evolutionary algorithm for performing automatic 3D biological image segmentation. We consider it as significant progress in this field due to the importance of image segmentation in biological research projects, together with the current lack of automatisation of the whole process for three-dimensional images, and the consequent experts' time investment. Our implementation makes use of LimeSeg, a three-dimensional region growing segmentation algorithm. LimeSeg has been integrated into our process, where the objective is to optimise its configuration parameters throughout generations. For this purpose, an initial set of seeds is needed as a starting point. Although these seeds may be manually determined, the time cost of their selection is negligible compared to the LimeSeg parameters selection.
Experimental results on three different tissues of confocal images from Drosophila melanogaster confirm the validity of our approach, where the obtained segmented images have been compared to those manually segmented, both visually and by means of the Jaccard index. The similarities between both images are remarkable for all tissues, where better solutions have been found when a larger number of generations have been used. However, the increase in this parameter entails a significant increase in the computational time. We consider that our approach constitutes a first step towards the complete automatisation of 3D image segmentation, presenting very promising results.