1. Introduction
Diatoms are an ecologically important class of unicellular algae that can be found in almost all aquatic systems and in terrestrial environments. Diatoms have a hydrated silica exoskeleton consisting of two halves of slightly different sizes, the hypotheca and the epitheca. The epitheca overlaps the hypotheca. Each theca consists of a valve, which varies in shape depending on the species, and a number of associated silicate girdle bands. A detailed description can be found in Round, Crawford and Mann [1].
Diatoms reproduce predominantly by vegetative division. As early as 1869, Macdonald [2] and Pfitzer [3] recognized in the diatoms they studied that during such cell division, the epitheca of the mother cell becomes the epitheca of the larger daughter cell and the hypotheca of the mother cell becomes the epitheca of the smaller daughter cell. For each theca of the mother cell, a slightly smaller theca is formed. This results in one cell the size of the mother cell and one smaller cell. This rule has been confirmed in many species (e.g., [4]), but exceptions are also known [5]. When a minimal size is reached, sexual reproduction occurs, resulting in one or more large initial cells. There are also species that can restore their size by vegetative enlargement (see [6] for a review).
Macdonald and Pfitzer assumed in their considerations that there is a uniform doubling time. The larger and smaller daughter cells divide simultaneously, and therefore the number of diatoms doubles in each generation. Such division behavior of a species that follows the Macdonald–Pfitzer rule is referred to here as the P-model.
In the following, the generation is numbered with the index $n$, whereby the development should begin with a single diatom at $n = 0$. In the literature, there are also generation counts starting with 1. In the context of cell size reduction for the morphological category of pennates (bilaterally symmetrical), the size of a diatom is understood to be the apical length. In the morphological category of centrics (radially symmetrical), the diameter of the circumcircle can be used. This length is characterized by a size index $k$. The value $k = 0$ corresponds to the cell of maximum size. The size index is incremented with every size reduction step. Each division of a cell with size index $k$ produces a cell of the same size (size index $k$) and a cell of the next reduction level ($k + 1$). With continued divisions, the smallest cell in the population under consideration reaches the maximum value $k_{\max}$, at which a cell enlargement becomes necessary. Using the notation $N^{\mathrm{P}}_{n}(k)$ for the number of diatoms with the size index $k$ in the $n$-th generation in the P-model, it follows from the Macdonald–Pfitzer rule for $1 \le k < k_{\max}$:
$$N^{\mathrm{P}}_{n+1}(k) = N^{\mathrm{P}}_{n}(k) + N^{\mathrm{P}}_{n}(k-1) \tag{1}$$
It is known from Pascal’s triangle that:
$$N^{\mathrm{P}}_{n}(k) = \binom{n}{k} \tag{2}$$
Without taking mortality processes into account and for diatom sizes above the threshold for the necessity of cell enlargement, the numbers of diatoms in the size classes are given by binomial coefficients. An early study applying this result to diatoms can be found in Richter [7].
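The binomial result can be checked numerically. The following Python sketch is an illustration added for this purpose and is not taken from the cited literature; the function name `p_model_counts` is hypothetical. It assumes a single starting cell with size index 0, strictly synchronous division, no mortality and no cell enlargement, and applies the Macdonald–Pfitzer rule generation by generation.

```python
from math import comb


def p_model_counts(generations):
    """Return a list c with c[k] = number of cells with size index k.

    Illustrative assumptions: one starting cell of size index 0,
    synchronous division, no mortality, no cell enlargement.
    """
    counts = [1]  # generation 0: one cell with size index 0
    for _ in range(generations):
        nxt = [0] * (len(counts) + 1)
        for k, c in enumerate(counts):
            nxt[k] += c      # daughter keeping the mother's epitheca: same index
            nxt[k + 1] += c  # daughter inheriting the hypotheca: next index
        counts = nxt
    return counts


n = 6
counts = p_model_counts(n)
print(counts)
print(counts == [comb(n, k) for k in range(n + 1)])
```

The comparison in the last line uses `math.comb`, so the sketch tests the recursion against the closed form rather than restating it.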
Otto Müller observed delayed vegetative division in certain cells of the chain-forming diatom Ellerbeckia arenaria (Moore ex Ralfs) R.M.Crawford 1988 (basionym Melosira arenaria D.Moore, see [8]). He named the epithecae with the letter f and the hypothecae with u, so that a diatom in a chosen horizontal position is named by a pair fu or uf. In his publications from 1883 [9] and 1884 [10], he distinguished between valves with and without thickening for the species under consideration (Figure 1a) and placed a small circle under the letter if the valve has no thickening and a line otherwise, so that the combinations shown in Figure 1b and their mirror images result. For details of the morphology, see [11].
Figure 2 shows an illustration of a chain from [10]. The girdle bands are marked by red lines. Each is connected to the larger valve of the diatom and points to a neighboring thickening.
If two diatoms in the chain have their larger valves next to each other, their girdle bands will point away from each other. There are no girdle bands that cover the valves of the two groups at their point of contact, which makes them visually separate. Otto Müller was able to show that this divides a chain into groups of two diatoms (sets of twins) or three diatoms (sets of triplets). In Figure 2, these groups are combined by horizontal brackets placed above the chain.
In all cell divisions, Otto Müller observed the transitions shown in Figure 1b. The valves of the parental cell keep their structure. For details, please refer to [9, 10, 12]. Regarding temporal development, it is essential that there is a delayed division, which Müller shows exemplarily for a chain of three diatoms (Figure 1c). According to the rule in Figure 1b, he traces this sequence back to a single starting cell, whereby the smaller daughter cell, which has inherited the hypotheca from the mother cell (u becomes f), skips a division. In the next generation step, it divides again. According to Müller, such a skipping of one division of the smaller daughter cell occurs in all structures shown in Figure 1b. This implies asymmetry in the division behavior. Such behavior is referred to below as the M-model. Based on the illustration in [13], Figure 3 shows the first three subsequent generations of the development of a diatom chain starting with a single cell.
Newly formed cells that will not divide in the next step, i.e., are in an interim state, are indicated by blue valves. The gray shapes between the generations visualize the mapping of mother cells to daughter cells. The numbers in these figures are the size indices of mother (top) and daughter cells (bottom), assuming that development began with an initial cell. Due to the skipping of divisions, the number of diatoms does not grow exponentially. This will be discussed in more detail later.
The third generation consists of a set of triplets followed by a set of twins, each with overlapping girdle bands. When drawing the development for the following generations, it becomes clear that the entire chain consists of these two groups.
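The grouping into twins and triplets can be reproduced with a small rewriting sketch. The single-letter encoding below is an assumption made for this illustration and is not Müller's notation: `L` and `R` stand for a cell with the larger valve on the left or right that divides in the next step, `l` and `r` for the same orientations in the interim state. Groups are split wherever two larger valves face each other, i.e., where a right-oriented cell is followed by a left-oriented one. The function names are hypothetical.

```python
# Assumed encoding (not Müller's notation):
#   "L"/"R": larger valve on the left/right, cell divides in the next step
#   "l"/"r": same orientation, but the cell skips the next step (interim state)
M_RULES = {"L": "Lr", "R": "lR", "l": "L", "r": "R"}


def m_model_chain(generations, start="L"):
    """Apply the assumed M-model rules `generations` times to `start`."""
    chain = start
    for _ in range(generations):
        chain = "".join(M_RULES[c] for c in chain)
    return chain


def group_sizes(chain):
    """Split the chain where two larger valves face each other (R then L)."""
    orient = chain.upper()
    sizes, current = [], 1
    for left, right in zip(orient, orient[1:]):
        if left == "R" and right == "L":
            sizes.append(current)
            current = 1
        else:
            current += 1
    sizes.append(current)
    return sizes


for n in range(1, 6):
    chain = m_model_chain(n)
    print(n, chain, group_sizes(chain))
```

Under these assumptions, the third generation splits into a group of three followed by a group of two, in line with the description above.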
A surprisingly different observation was made by Laney, Olson, and Sosik [14] on Ditylum brightwellii using time-lapse imaging. They report that the daughter cell that inherits the hypothecal frustule divides in each generation. However, the division of the larger daughter cell is delayed in comparison. This model is subsequently denoted as the L-model. In an idealized approach, in which exactly one division is skipped, the L-model and the M-model form a pair of opposites regarding division behavior. For further details, particularly on the temporal behavior of the development, see [13, 15].
Chronological sequences of patterns, such as those shown in Figure 1c and Figure 3, are referred to as generations in the literature cited (e.g., [13]), although not all diatoms divide in the time interval between these snapshots; some skip a doubling period. This convenient counting method will also be used here. However, restrictions and limitations of this representation should be pointed out.
A description of the processes, as given by O. Müller [9, 10], is implicitly based on the assumption of equal doubling times, whereby the doubling time is understood as the period between successive vegetative cell divisions. In the case of cells that skip a division, the double interval is assumed. It is known that doubling times and thus also the generation time, which is defined as the mean doubling time, depend on environmental parameters such as light intensity, nutrient concentration or temperature (see, e.g., [16]). Even under constant environmental conditions, the doubling time is a random variable that has a natural statistical range of variation and can be described by a density function [17]. In the case of the M-model and L-model, there are therefore two distributions, one for division without delay and one for division with delay.
The stochastic character of the doubling times leads to a temporal drifting apart of the divisions in a culture. When starting a clonal culture with a single cell, after many divisions you will encounter a wide variety of division stages occurring simultaneously. For the applicability of the given considerations and the visualization of the generations of a chain, it is essential that the width of the distribution, e.g., measured as the standard deviation, is small compared to the generation time. The smaller this ratio is, the longer a real colony, as shown in Figure 2, can grow while still corresponding to the theoretical structure resulting from the stepwise application of the rules in Figure 1b. Observations and quantitative considerations on the synchronicity of divisions in diatom chains can be found in [18, 19].
It should be mentioned that cell divisions of diatoms can be synchronized by varying environmental parameters periodically. This can be achieved by periodic changes between phases of high and low light intensity [20, 21, 22, 23]. The periodic addition of silicate to colonies that are kept under silicate deprivation can also be used for synchronization [24, 25]. Provided they can be successfully applied, these methods offer the fundamental possibility of studying chain-like diatom colonies over many generations.
A Lindenmayer system (L-system) is a mathematical formalism developed by Aristid Lindenmayer that is particularly suited to modeling the growth processes of plant development [26, 27]. It is also a useful tool in the context of these considerations. An L-system is a triple consisting of:
an alphabet,
an initial string ω,
a set of replacement rules.
The alphabet is a set of letters. The term “letter” is broadly defined, as a letter can also contain indices and thus numbers.
The formalism starts with the initial string ω, a sequence of letters of the alphabet. The replacement rules are applied to this string. By replacing individual parts (individual characters, substrings), a new string of letters of the alphabet is obtained. The application of the replacement rules is continued iteratively so that a sequence of such strings is created. In the following, a generation of a chain-like diatom colony is simply understood as the associated string.
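The iterative rewriting can be sketched in a few lines of Python. This is a minimal illustration, assuming context-free rules that replace single characters; the function name `rewrite` is hypothetical, and the example rules A → AB, B → A are a standard textbook system, not one of the diatom systems discussed here.

```python
def rewrite(axiom, rules, steps):
    """Return the list of strings produced by `steps` parallel rewriting steps.

    `rules` maps single letters to replacement strings; letters without
    a rule are copied unchanged (an assumption of this sketch).
    """
    strings = [axiom]
    for _ in range(steps):
        previous = strings[-1]
        strings.append("".join(rules.get(letter, letter) for letter in previous))
    return strings


example_rules = {"A": "AB", "B": "A"}
for generation, string in enumerate(rewrite("A", example_rules, 5)):
    print(generation, len(string), string)
```

All letters of a string are replaced simultaneously in each step, which is what distinguishes an L-system from a sequential rewriting grammar.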
Koster and Lindenmayer successfully used an L-system to describe the positions of heterocysts in a filamentous colony of the cyanobacterium Anabaena catenula [28]. As a one-dimensional system is described here, the term “one-dimensional L-system” is used. L-systems also allow the treatment of plant structures with complex branching, which enables the computer generation of images of plants [29]. In the context of diatoms, a one-dimensional L-system can be used to determine the position of connection points in zig-zag-shaped chains of Diatoma vulgaris [19]. Of importance for the study presented here is the fact that the size sequence of diatoms in a chain-like colony can be modeled by an L-system. This approach is described by Ussing et al. [30]. The authors assume that the Macdonald–Pfitzer rule is applicable and that all cells divide synchronously. The resulting sequence was analyzed in [18].
It is remarkable that Otto Müller [9] defined an alphabet with the abbreviations f and u, each marked according to the thickening of the valve, and specified substitution rules (Figure 1b). These can be understood not only at the level of pairs of letters of an alphabet, but also as replacement rules for single frustules. The only missing extension for a complete description as an L-system is one that allows the delayed division shown in Figure 1b to be included in the calculation scheme.
The structure of a chain-like colony whose diatoms divide according to the Macdonald–Pfitzer rule, in particular the size sequence of the diatoms and the number of diatoms of a certain size at a given generation, obviously depends on the model of division. In this work, the properties of such chains and the distribution of sizes will be studied for the presented models. The background to this question is the possibility of investigating the division behavior of a species under consideration. Direct evidence of delayed division requires an observation period on the order of several generations. It should be borne in mind that the question arises in the case of chain-like diatoms, other diatoms living in colonies and individual diatoms. Motility complicates observation. On the other hand, easily testable criteria require the fulfillment of certain conditions, in particular sufficient synchronicity of division processes and the negligibility of death processes within the examined sample.
3. Results
3.1. Alternatives to Long-Term Observations
When investigating the population dynamics of a particular diatom species, the question of the validity of the Macdonald–Pfitzer rule arises. If the rule holds, which we will assume in the following, a simple division scheme without delay, one of the two delay models presented, or another scheme could apply. An obvious option is long-term observation, which provides direct information after a few division processes. This requires a sample that remains vital over the observation period. A sufficient supply of nutrients and adequate light exposure must be ensured.
Long-term observations can also provide information without observing the individual division processes. The growth of a culture under consistently good conditions, i.e., in particular without depletion of nutrients and without decreasing exposure to light, can provide an indication of whether exponential or Fibonacci growth is present after measuring the size of the population. Ideally, the culture should be started with a single cell. To measure the population, an automated method using image analysis is recommended. A counting chamber can also be used for this purpose. Alternatively, biomass can be determined if samples are large enough. Whether the species forms chains is not important. As the Fibonacci sequence grows asymptotically exponentially, differentiation becomes more and more difficult as the number of generations increases. Measurement of the growth rate does not differentiate between the M-model and the L-model.
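The difference between the two growth laws can be illustrated with a short Python sketch. It is an idealization added here, assuming strict synchrony, no mortality and a start from a single cell that divides in the first step; the function name `population_sizes` is hypothetical. With delay, one daughter of each division skips one step, which is the same bookkeeping for the M-model and the L-model.

```python
def population_sizes(generations, delayed):
    """Total number of cells per generation, starting from one dividing cell.

    delayed=False: every cell divides in every step (P-model).
    delayed=True:  one daughter of each division skips one step
                   (M-model or L-model; the totals coincide).
    """
    active, waiting = 1, 0  # cells dividing next step / cells skipping it
    totals = [1]
    for _ in range(generations):
        if delayed:
            active, waiting = active + waiting, active
        else:
            active, waiting = 2 * active, 0
        totals.append(active + waiting)
    return totals


exponential = population_sizes(10, delayed=False)
fibonacci = population_sizes(10, delayed=True)
for n, (p, f) in enumerate(zip(exponential, fibonacci)):
    print(n, p, f)
```

In this idealization, the growth factor per generation is exactly 2 without delay, while with delay it approaches the golden ratio of about 1.618.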
Other results already presented offer possibilities for analysis, especially on non-living samples and preparations. They therefore do not necessarily require cultivation of the species. The methods are explained below. They mainly refer to chain-shaped colonies. Synchronicity of cell divisions is only to be expected locally in such a chain, as it is lost after a few divisions among its descendants. In the tree structure of the divisions, starting from an initial cell, there are inevitably neighboring cells whose common ancestor dates back so many generations that synchronicity of their divisions is very unlikely. On the other hand, there are many sections that are sufficiently synchronous to allow analysis using the methods described. If we assume that the distribution of doubling times is so narrow that synchronicity over 5 generations is almost always given [18], a chain of at least $2^5 = 32$ diatoms is created in the P-model, which is likely to be consistent with theory. In one of the models with Fibonacci growth, there are only 13 diatoms. Regardless of whether size differences, orientations or sets of twins and triplets are examined, a loss of synchronicity must therefore be expected at shorter chain lengths in these models. The shorter the range of a match with a theoretical sequence, the more likely it is that it is a random coincidence and the less informative it is. On the other hand, it can be seen that the sequences for the different models do not have longer matching sections, so that in the case of synchronous sections a short sequence of recorded properties is sufficient for an unambiguous assignment.
If the synchronicity is not preserved across the entire recorded chain, it is not easy to identify where the synchronous subsections are located. Calculating the cross-correlation between the recorded and theoretical sequences allows us to find the closest match in such cases. In the case of sequences of alphanumeric characters, these are first replaced by numbers for the purpose of calculation. When this is performed for different models, it provides a measure of how well the test sequence matches the theory. In addition, matching sections can be identified. If no such sections are found, the synchronicity may persist for only a few cell divisions. Alternatively, neither model may be correct.
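A simple way to locate the closest match is to slide the recorded sequence along the theoretical one and score each position. The Python sketch below uses the fraction of matching positions as the score, which is a simplified stand-in for the cross-correlation mentioned above and not the procedure of any cited work; the function name `best_alignment` and the example sequences are hypothetical.

```python
def best_alignment(recorded, theoretical):
    """Slide `recorded` along `theoretical`.

    Returns (offset, score) for the first offset with the highest fraction
    of matching positions. Works for strings and for lists of numbers.
    """
    if not recorded or len(recorded) > len(theoretical):
        raise ValueError("recorded must be non-empty and not longer than theoretical")
    best_offset, best_score = 0, -1.0
    for offset in range(len(theoretical) - len(recorded) + 1):
        window = theoretical[offset:offset + len(recorded)]
        score = sum(a == b for a, b in zip(recorded, window)) / len(recorded)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


theory = "LRRLRLLRLRLLR"       # hypothetical theoretical orientation sequence
print(best_alignment("LLRL", theory))  # exact fragment
print(best_alignment("LLRR", theory))  # fragment with one recording error
```

A score of 1.0 indicates an exact match at the returned offset; lower scores show how much of the fragment agrees with the best candidate section.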
3.2. Testing the Applicability of the Müller Model
A detailed description including the girdle band structure in chain-shaped colonies was only given for the Müller model. Whether this correctly describes the situation in a sample can be checked by placing an existing fragment of a chain in a defined horizontal position and expressing the visible structure as a string using the alphabet (3). If we restrict ourselves to the four types of the alphabet and their mirror images, i.e., ignoring the sizes, we obtain a sequence of orientations and thickenings. As it is not known at which point in the generation sequence the observed string occurs for the first time, it must be compared to the theoretical sequence according to (18) for a sufficient length. If there is no match, one must also search for the mirrored string in the theoretical sequence, as the chain has an orientation that is caused by the orientation of the initial sequence. Only a match is relevant, as there can be various reasons why no match is found. The model may be inappropriate, the recording may be incorrect or there may be a loss of synchronization. This statement applies analogously to all comparisons of recorded sequences with theoretical sequences.
The grouping of a chain into sets of twins and sets of triplets is based on the girdle band structure. It is easy to recognize visually. The procedure is analogous; the comparison is based on the alphabet and development as described in Section 2.1.9.
3.3. Analysis of Sizes and Size Differences
There is not necessarily a strict relationship between delayed cell division and the occurrence of thickenings and the girdle band structure as described by Müller. To differentiate between division models without delay and with delay, where either the smaller or the larger daughter cell divides with delay, it is useful to examine the size sequence. However, it is difficult to assign a size index to measured absolute sizes. The reasons are as follows:
There is natural variation in size even among initial cells. This means that at best it is possible to give an interval for the size index in which the measured diatom is likely to fall.
The decrease in the mean size per generation is a non-linear function of time. Size, plotted against the size index, decreases most rapidly in large diatoms [33, 34]. This makes classification difficult.
As these data are not systematically collected, they would need to be collected as a first step.
Although it turns out that incorrectly assigned size indices shifted by a fixed number also produce sequences that exist in the theoretical size sequence, an alternative is to examine the differences in size, and thus in size index, between neighboring diatoms. In this case, no uncertain large numbers appear in the sequences, but only the small differences. Even with this approach, differences in size must be assigned to differences in size indices.
As mentioned above, in the P-model the size indices of neighboring diatoms differ by only ±1 [18]. The frequencies of the length differences are then present in two peaks, with the scatter increasing with sample size due to the non-linear dependence of the size on the size index. To avoid errors due to loss of synchronization, it is important to ensure that the division stages are close together. The value range for the M-model additionally contains zero, and in the L-model the values +2 and −2 are included. This makes it possible to distinguish between the three models presented without analyzing the sequence, but it does not allow us to say whether another model describes the colony better. The advantage of the method lies in the simplicity of its procedure. As can be seen from the chains (26) and (40), all possible values occur even in short chains.
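The value ranges can be reproduced with a small rewriting sketch. The encoding is an assumption made for this illustration and is not the notation of the sequences (25), (27) and (39): each cell is a pair of a state letter and a size index, `L`/`R` mark the side of the larger valve for a cell that divides in the next step, and `l`/`r` mark a cell that skips the next step. The daughter that keeps the mother's epitheca stays on the side of the larger valve and keeps the size index; the other daughter gets the index increased by one.

```python
# Assumed rules: state -> list of (daughter state, size index increment)
RULES = {
    "P": {"L": [("L", 0), ("R", 1)], "R": [("L", 1), ("R", 0)]},
    "M": {"L": [("L", 0), ("r", 1)], "R": [("l", 1), ("R", 0)],
          "l": [("L", 0)], "r": [("R", 0)]},   # smaller daughter waits
    "L": {"L": [("l", 0), ("R", 1)], "R": [("L", 1), ("r", 0)],
          "l": [("L", 0)], "r": [("R", 0)]},   # larger daughter waits
}


def next_generation(chain, model):
    """Rewrite every (state, size index) pair of the chain in parallel."""
    return [(state, k + dk)
            for old_state, k in chain
            for state, dk in RULES[model][old_state]]


def difference_values(model, generations):
    """Pool the size index differences of neighbors over all generations."""
    chain, seen = [("L", 0)], set()
    for _ in range(generations):
        chain = next_generation(chain, model)
        indices = [k for _, k in chain]
        seen.update(b - a for a, b in zip(indices, indices[1:]))
    return seen


for model in ("P", "M", "L"):
    print(model, sorted(difference_values(model, 6)))
```

The differences are pooled over the first generations of a chain grown from a single cell, so the output lists the values that occur in at least one of these generations.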
If a longer fragment of a chain-like colony with similar division stages is available, a comparison with the difference sequences according to (25), (27) and (39) is recommended. The significance of a match with one of these sequences increases with the length of the matching sequence. Once a fragment has been analyzed and a difference of size indices has been assigned to each size difference between neighboring diatoms, we look for this pattern or its mirror image in the calculated sequences. This procedure is similar to assigning part of a fingerprint to a location in a complete image. Since the difference sequence is self-similar, this assignment is not unique. If a sequence appears for the first time in a particular generation, it will also appear in subsequent generations due to the structure of the recursions. In all recursions, the number of fingerprint matches increases with the generation number, as a particular pattern is repeatedly appended and mirrored.
The feasibility of the method was demonstrated on Eunotia sp., probably Eunotia glacialis Meister 1912, in [18]. In this species, the length differences for adjacent size indices are around 1 µm and are therefore easy to measure. A colony of 25 diatoms was analyzed and found to be consistent with the P-model theory.
3.4. Analysis of the Orientations
As shown, the sequences of sizes and orientations can be considered separately. It is useful to use the orientations of a clonal diatom chain as a fingerprint. The advantage of this is that it is not necessary to measure lengths quantitatively, but only to identify and record which diatoms have the larger valve on the left (l and L) or right (r and R). Since it must be assumed that it is not possible to distinguish visually between l and L or r and R, these characters are treated as identical when comparing the sample and the theoretical sequence. In a P-model, only the letters L and R exist anyway. It has been mentioned that they have alternating orientations.
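The orientation comparison can be sketched as a substring search. The rewriting rules below are an assumed single-letter encoding for this illustration, not the sequences derived in Section 2: `L`/`R` are cells that divide in the next step, `l`/`r` cells that skip it. As described above, upper and lower case are treated as identical, and the mirror image of a fragment (read from the other end, with left and right exchanged) is searched as well. All function names are hypothetical.

```python
RULES = {
    "P": {"L": "LR", "R": "LR"},
    "M": {"L": "Lr", "R": "lR", "l": "L", "r": "R"},  # smaller daughter waits
    "L": {"L": "lR", "R": "Lr", "l": "L", "r": "R"},  # larger daughter waits
}


def orientations(model, generations, start="L"):
    """Orientation string of a chain, with l/L and r/R treated as identical."""
    chain = start
    for _ in range(generations):
        chain = "".join(RULES[model][c] for c in chain)
    return chain.upper()


def mirror(fragment):
    """Fragment as seen from the other side: reversed, with L and R exchanged."""
    return fragment[::-1].translate(str.maketrans("LR", "RL"))


def matches(fragment, theory):
    """True if the fragment or its mirror image occurs in the theoretical string."""
    fragment = fragment.upper()
    return fragment in theory or mirror(fragment) in theory


theory_m = orientations("M", 5)
print(theory_m)
print(matches("rlRrL", theory_m))
print(matches("RRL", orientations("P", 5)))
```

Short fragments can occur in more than one model, so a match should be checked against every candidate model and is only informative for sufficiently long fragments.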
3.5. Analysis of the Size Distribution
Measuring the lengths of many diatoms and classifying them in a histogram with subsequent comparison with theory takes a special place among the possibilities presented:
One is not dependent on longer chain-like colonies but can also analyze short chains of a few diatoms, as well as diatoms that separate immediately after division.
With an increasing number of diatoms, i.e., with a high generation index n in the sample analyzed, synchronism becomes less and less important. The reason for this is that the curves for the numbers of diatoms in the P-model, M-model and L-model are each similar as a function of k for closely spaced n (see below).
However, there are limitations and challenges:
A culture started with a single diatom should be used as the basis, otherwise there will be a superposition of several distributions for the respective starting size.
The theory gives the number of diatoms for a size index, not the size. The comparison of the measurement to the theory therefore requires the assignment of the size to the size index.
To compare the curve shapes, you do not need an exact assignment to a size index, as would be necessary for the comparison with the size sequence of chains, but a function as shown in [33, 34] should be available to equalize the diagrams. If we consider the size indices at a given generation in a model without delay, then the theoretical frequency is given by (2). The binomial coefficients at a generation n, i.e., at a chosen measurement time, are symmetrical in k about k = n/2. Apart from the fact that the size index 0 corresponds to the maximum size, this is not equivalent to a histogram of sizes due to the non-linearity mentioned above. However, symmetry is an important distinguishing criterion for the division models. To illustrate the characteristics of the distributions, the numbers of diatoms in the M-model and L-model are shown in Figure 8a for a fixed generation n over k. The two models have the same number of diatoms in each generation so that they can be visualized at the same scale.
In order to visualize the P-model together with the other models, one could limit oneself to normalized values. To emphasize the quantitative difference in the entire population, a logarithmic representation was chosen in Figure 8b for all three models. The uneven heights in the L-model are characteristic, although they are less noticeable in the logarithmic representation. These discontinuities remain characteristic in later generations.
Figure 8c corresponds to Figure 8a but has been calculated for a later generation. Although the maxima in Figure 8c are about 6 orders of magnitude higher than those in Figure 8a, the similarity of the bar charts is striking. The corresponding logarithmic representation is shown on the right in Figure 8d.
The position of the maximum in the M-model is clearly shifted towards smaller size indices compared to the other models. However, the delayed division of the larger daughter cell in the L-model does not lead to a corresponding shift towards larger indices. The maxima of the curves are at slightly lower k values in the L-model compared to the P-model.
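The shift of the distributions can be explored with a short counting sketch. It is an idealized bookkeeping added for illustration, assuming strict synchrony, no mortality, no cell enlargement and a start from a single dividing cell with size index 0; it is not the derivation used for Figure 8, and the function names are hypothetical. Cells are split into those that divide in the next step and those that skip it.

```python
def size_distribution(model, generations):
    """Return counts with counts[k] = number of cells with size index k."""
    size = generations + 1
    active = [0] * size   # cells that divide in the next step
    waiting = [0] * size  # cells in the interim state (skip one step)
    active[0] = 1
    for _ in range(generations):
        shifted = [0] + active[:-1]  # smaller daughters: size index k + 1
        if model == "P":    # both daughters divide in the next step
            active = [a + s for a, s in zip(active, shifted)]
        elif model == "M":  # smaller daughter waits
            active, waiting = [a + w for a, w in zip(active, waiting)], shifted
        elif model == "L":  # larger daughter waits
            active, waiting = [s + w for s, w in zip(shifted, waiting)], active
        else:
            raise ValueError("model must be 'P', 'M' or 'L'")
    return [a + w for a, w in zip(active, waiting)]


def mean_index(counts):
    """Mean size index of the population described by `counts`."""
    return sum(k * c for k, c in enumerate(counts)) / sum(counts)


for model in ("P", "M", "L"):
    counts = size_distribution(model, 20)
    print(model, sum(counts), round(mean_index(counts), 2), counts.index(max(counts)))
```

The printed columns are the total number of cells, the mean size index and the position of the maximum, which allows the ordering of the three models described above to be checked for any chosen generation.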
3.6. Maximum Size Index
As explained at the beginning, no limitation of the size index has been introduced. However, there is a minimum size, i.e., a maximum size index $k_{\max}$, at which cell enlargement becomes necessary. Whether this occurs through sexual reproduction, as is usually the case, or through vegetative enlargement depends on the species. A certain natural range of variation is presumably also given here, so that $k_{\max}$ is not necessarily a well-defined value. In the context of size distributions, it should merely be noted that a certain size index is reached much earlier with the P-model compared to the M-model. The division behavior of the M-model not only reduces the reproduction rate but also leads to a higher proportion of diatoms having a large valve size after the same period of development, i.e., the same number of generations. Species that divide according to the M-model significantly delay the onset of cell enlargement. As already discussed with reference to Figure 8, an analogous statement does not apply to the L-model, because the courses of the L-model and P-model are only slightly shifted with respect to each other.
4. Discussion
In view of the restriction of this mathematical modeling to the division process, the question arises as to the limitations and significance of the results presented here. Models for the development of diatoms, including their size, can have different objectives. Models that are intended to describe the dynamics of the population over a longer period of time must include death processes as well as regularly occurring cell enlargement [13, 15, 34].
As the focus here is on the possibility of differentiating between the division models, only the growth phase starting from a single diatom is considered. The applicability is limited by the period of Fibonacci growth or, in the case of the P-model, exponential growth. This implicitly means a negligible mortality rate. This statement applies to the study of a fragment of a chain colony as well as to the size distribution of an entire culture.
Most of the concepts presented require a clonal chain-like colony. The need for a sufficient number of synchronous divisions in such samples was pointed out. Looking at the size distribution of a culture that has grown from a single diatom does not have these difficulties, because the characteristics of the distributions are almost independent of the generation. Loss of synchronicity means that the number of generations passed through is not identical but does not change the basic forms of the curves.
If it is not possible to assign a sample to any of the models considered, this may be due to the limitations of the method or its applicability to the sample (e.g., loss of synchronicity), but it could also indicate other possibilities that occur in nature. There is probably no principle that would prevent more than one generation from being skipped during cell division, or the larger and smaller daughter cells from dividing alternately with a delay. Delays of fractions of a generation time or random processes cannot be ruled out either. For the investigation of so far unknown division rules, direct observation of the processes is the method of choice.
If you want to describe the development of the system based on the division behavior of the models, taking into account the deviations from a fixed doubling time, you can add a timestamp to each letter of the alphabet to indicate when the last division occurred. The time intervals between successive divisions are described by a probability density function. A real system can be visualized as a tree, with division times plotted along a time axis. Such trees can be generated by a Monte Carlo simulation. After a division has taken place, the time of the next division is determined for each daughter cell using the probability density function. The division rules are taken into account according to the model under consideration. The generation index is also incremented when intermediate states are reached. The generation indices of the existing diatoms are therefore known for a specific point in time. Adjacent diatoms with the same generation index belong to one of the sequences derived here, regardless of which property is considered. The lengths of synchronous segments and their size distribution can be determined. From a large number of such randomly generated trees, statistical information about the lengths of synchronous segments can be obtained. The frequencies of lengths of synchronous sections as a function of the variance of the density function of the doubling times can be considered as a sensitivity analysis. Such simulations of non-synchronous systems are outside the scope of the considerations presented here.
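To make the idea of such a Monte Carlo simulation concrete, the following Python sketch draws division times at random and counts the cells present at a given time. It is only a minimal illustration under assumptions that are not taken from the text: doubling times are drawn from a normal distribution truncated at a small positive value, a delayed cell waits for the sum of two such draws, and positions in the chain and generation indices are not tracked. The function name and its parameters are hypothetical.

```python
import heapq
import random


def simulate_cell_count(model, horizon, mean=1.0, sd=0.1, seed=0):
    """Number of cells at time `horizon`, starting from one cell at time 0.

    model: "P" (no delay), "M" (smaller daughter delayed),
           "L" (larger daughter delayed).
    """
    rng = random.Random(seed)

    def waiting_time(delayed):
        # Assumption: a delayed cell waits for two independent doubling times.
        draws = 2 if delayed else 1
        return sum(max(rng.gauss(mean, sd), 0.01 * mean) for _ in range(draws))

    pending = [waiting_time(False)]  # next division time of every living cell
    cells = 1
    while pending[0] <= horizon:
        t = heapq.heappop(pending)
        cells += 1  # the mother cell is replaced by two daughters
        heapq.heappush(pending, t + waiting_time(model == "L"))  # larger daughter
        heapq.heappush(pending, t + waiting_time(model == "M"))  # smaller daughter
    return cells


for model in ("P", "M", "L"):
    print(model,
          simulate_cell_count(model, 5.5, sd=0.0),
          simulate_cell_count(model, 5.5, sd=0.2, seed=1))
```

With a standard deviation of zero the sketch reduces to the synchronous case; increasing `sd` shows how the cell number at a fixed time begins to scatter between runs.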
The complexity of the systems studied, in particular their fractal structures, is an emergent property resulting from the elementary rules of L-systems. These structures demonstrate the role of fractals in nature. However, in the context of the study of occurring patterns, they are less important than locally testable properties, such as the range of values of differences in size indices or shorter sequences of orientations.