Novel Features for Binary Time Series Based on Branch Length Similarity Entropy

Branch length similarity (BLS) entropy is defined in a network consisting of a single node and branches. In this study, we mapped a binary time-series signal onto the circumference of a time circle so that the BLS entropy can be calculated for the binary time-series. We obtained the BLS entropy values for the “1” signals on the time circle; the set of these values is the BLS entropy profile. We selected the local maximum (minimum) point, slope, and inflection point of the entropy profile as the characteristic features of the binary time-series and investigated their significance. The local maximum (minimum) point indicates the time at which the rate of change in the signal density becomes zero. The slope and inflection points correspond to the degree of change in the signal density and the time at which the signal density changes occur, respectively. Moreover, we show that the characteristic features can be widely used in binary time-series analysis by characterizing the movement trajectory of Caenorhabditis elegans. We also mention the problems that need to be explored mathematically in relation to the features and propose candidates for additional features based on the BLS entropy profile.


Introduction
Time series is a ubiquitous and widely used data type owing to the prevalence of Internet-based network information. Data are generated in several fields, including medicine and healthcare, science, finance, economics, government, industry, environmental science, and socio-economics [1][2][3][4][5][6]. Thus, over the past decade, researchers have developed various approaches to analyze data to understand the properties of various systems in diverse fields [7][8][9][10]. The purpose of analysis is primarily to predict signal occurrence, classify time series into one or several classes, detect anomalies or motifs contained in the data [11], or quantify similarities (or dissimilarities) between time series [12,13]. The approaches can be classified into four categories depending on the purpose.
Distance-based approaches use the similarity between time series. Among them, Euclidean distance (ED) and dynamic time warping (DTW) measurements are widely used. ED calculates the similarity as the square root of the sum of squared differences between elements corresponding to the same time position in two time series of the same length. While this measurement is simple and intuitive, it is overly sensitive to outliers and cannot directly compare time series of different lengths. DTW, which overcomes these drawbacks, minimizes the effects of shifting and distortion in time by allowing similar shapes to be detected even when they are out of phase, through an "elastic" transformation of the time series [14,15]. Lines and Bagnall compared several distance measures and showed that no distance measure significantly outperforms DTW [16].
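To make the contrast between the two measures concrete, the following is a minimal sketch of ED and of the textbook dynamic-programming form of DTW. It is not the implementation used in the cited works; the function names and the absolute-difference local cost are assumptions chosen for illustration.

```python
import math

def euclidean_distance(a, b):
    """Square root of the sum of squared element-wise differences."""
    if len(a) != len(b):
        raise ValueError("ED requires two series of the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(a, b):
    """Textbook DTW: cheapest monotone alignment of a and b.

    Local cost is |a[i] - b[j]| (an assumption for this sketch);
    the series may have different lengths.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]

# The same pulse, shifted by one time step.
pulse = [0, 0, 1, 0, 0]
shifted_pulse = [0, 1, 0, 0, 0]
ed = euclidean_distance(pulse, shifted_pulse)
dtw = dtw_distance(pulse, shifted_pulse)
```

In this example ED penalizes the one-step shift, whereas the elastic alignment of DTW absorbs it, which is the out-of-phase behavior described above.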
Feature-based approaches extract structural features that reflect the properties of the time series and analyze the extracted features using an existing classification method [17]. These methods typically include signal processing techniques using various transforms, [...] To examine the applicability of the features, we compared the crawling trajectories of Caenorhabditis elegans exposed to two toxic substances (benzene and formaldehyde) using the local maximum point. In the discussion section, we briefly mention the problems that need to be explored mathematically in relation to the features proposed in this study and propose candidates for additional features based on the BLS entropy profile.

BLS Entropy and Its Profile
BLS entropy is defined in simple networks consisting of a node and several branches connected to the node [37][38][39] (Figure 1). The ratio of the length of each branch to the sum of the lengths of all branches is defined as the probability of each branch, as follows:

$$p_k = \frac{L_k}{\sum_{j=1}^{n} L_j} \qquad (1)$$

where n is the number of branches in the network, and $L_k$ represents the length of the kth branch (k = 1, 2, 3, ..., n). Thus, the BLS entropy can be mathematically written as:

$$S = -\frac{\sum_{k=1}^{n} p_k \log(p_k)}{\log(n)} \qquad (2)$$

The higher the similarity of the lengths of all the branches of the network, the closer the entropy (S) is to 1.0, and the lower the similarity, the closer it is to 0.0 [18]. For clarity, examples are provided in which the difference in length between branches is relatively large or small. When the length difference was large, the S value was low, and vice versa.
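The definition above can be sketched in a few lines. The function name `bls_entropy` and the input checks are illustrative assumptions; the computation follows the branch probabilities and the normalization by log(n) described in the text.

```python
import math

def bls_entropy(lengths):
    """Normalized BLS entropy of a node with the given branch lengths."""
    n = len(lengths)
    if n < 2:
        raise ValueError("at least two branches are needed, since log(1) = 0")
    if any(length < 0 for length in lengths):
        raise ValueError("branch lengths must be non-negative")
    total = sum(lengths)
    if total <= 0:
        raise ValueError("at least one branch must have positive length")
    # Probability of each branch: its length divided by the total length.
    probs = [length / total for length in lengths]
    # Shannon entropy of the branch probabilities (0 * log 0 taken as 0).
    shannon = -sum(p * math.log(p) for p in probs if p > 0)
    # Dividing by log(n) rescales the value to the interval [0, 1].
    return shannon / math.log(n)

similar = bls_entropy([1.0, 1.0, 1.0, 1.0])      # equal branch lengths
dissimilar = bls_entropy([100.0, 1.0, 1.0, 1.0])  # one dominant branch
```

Equal branch lengths give the maximum value, and a single dominant branch pulls the value down, in line with the behavior of S described above.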

Time Circle for a Time Series
We introduce the concept of a time circle to calculate the BLS entropy value for the signal "1" in the binary time-series. A binary time-series is a sequence of "1" or "0" signals on the discrete time axis. In Figure 2, the upper image represents a binary time-series consisting of 400 randomly distributed "1" signals. We connected the first and last signals of the binary time-series by sequentially mapping the "1" signals to the circumference of a circle. We refer to this circle as the time circle. Without the time circle, it would be difficult to find a physical quantity that corresponds to the branch length of the BLS entropy in a time series. If the distance between signals along the time axis is defined as the branch length, a few signals that are very far apart from each other drive the BLS entropy value toward zero, regardless of the distribution of the entire signal. In other words, the time circle prevents the information on the entire structure of the time series from being diluted by distant signals. After a time circle is created for a binary time-series, we can calculate the BLS entropy value for a signal by connecting the signal with all the other signals. Then, we can obtain the entropy profile by sequentially calculating the entropy values for the signals along the direction of the time flow (bottom image of Figure 2).
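The construction of the entropy profile can be sketched as follows. The text does not state how the branch length between two signals on the time circle is measured, so this sketch assumes a unit circle, a position angle of 2πt/L for time index t, and straight chords as branches. These choices and all function names are assumptions.

```python
import math

def bls_entropy(lengths):
    """Normalized BLS entropy of a node with the given branch lengths."""
    total = sum(lengths)
    probs = [length / total for length in lengths]
    shannon = -sum(p * math.log(p) for p in probs if p > 0)
    return shannon / math.log(len(lengths))

def bls_entropy_profile(series):
    """Return (positions of "1" signals, BLS entropy at each of them).

    Assumption: time index t (0-based) of a series of length L is placed
    at angle 2*pi*t/L on a unit circle, so the first and last time steps
    become neighbors, and each branch is the chord to another "1" signal.
    """
    length = len(series)
    ones = [t for t, value in enumerate(series) if value == 1]
    if len(ones) < 3:
        raise ValueError("each signal needs at least two branches")
    angles = [2.0 * math.pi * t / length for t in ones]
    profile = []
    for i, a_i in enumerate(angles):
        # Chord length on a unit circle between angles a_i and a_j.
        chords = [2.0 * math.sin(abs(a_i - a_j) / 2.0)
                  for j, a_j in enumerate(angles) if j != i]
        profile.append(bls_entropy(chords))
    return ones, profile

# A uniform series: every "1" sees the same set of chords.
uniform_positions, uniform_profile = bls_entropy_profile([1, 0] * 50)
# A series that is dense in its first half and sparse in its second half.
mixed_positions, mixed_profile = bls_entropy_profile([1, 0] * 50 + [1, 0, 0, 0, 0] * 20)
```

Under these assumptions, a perfectly uniform series yields a flat profile, and any structure in the profile comes from heterogeneity in the signal distribution.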

Characteristic Features of Binary Time-Series Appearing in the BLS Entropy Profile
To understand the binary time-series structure, we explored the local maximum (minimum) point, slope, and inflection point of the BLS entropy profile as the characteristic features of the binary time-series. To this end, we created a binary time-series Q(t) of length L (= 10,000) consisting of "1" signals heterogeneously distributed on the time axis. Using the neutral theory [39,40], we generated several heterogeneous landscapes determined by the control variable H, ranging from 0.0 to 1.0 (Figure 3A, left). The closer the H value is to 0.0, the sharper the peaks in the landscape, and the closer it is to 1.0, the smoother the peaks. The heights of the peaks in the landscape have a value between 0.0 and 1.0. Then, we assigned the values of 1 and 0 to grid sites where the height of the landscape was above and below 0.5, respectively (Figure 3A, right). Yellow and dark blue represent grid sites with the values of 1 and 0, respectively. Next, we created a single vector of length 10,000 by sequentially concatenating the column vectors of the binary image. This vector is a binary time-series Q(t) with the heterogeneity H (Figure 3B). Finally, we obtained the BLS entropy profile $S(\tilde{t})$ from the time circle for Q(t) (Figure 3C). Here, t represents all the time values on the time axis (t = 1, 2, ...), and $\tilde{t}$ is an array of the t values at which the signal "1" is located. Therefore, the length of $\tilde{t}$ is always less than or equal to that of t. We subjectively selected four interesting domains in $S(\tilde{t})$ and investigated their corresponding domains in Q(t). Corresponding domains in $S(\tilde{t})$ and Q(t) are marked by squares of the same color. Within the blue square of $S(\tilde{t})$, the BLS entropy value gradually increases, whereas the signal density in the corresponding domain in Q(t) slowly decreases. When the inside of the blue square of Q(t) was enlarged, bands composed of bundles of "1" signals appeared, and the band length tended to decrease slightly along the time axis (see the top of Figure 3D). This reflects the fact that high (low) BLS entropy values indicate low (high) signal densities in the binary time-series. We prove this mathematically in Appendix A. In the red square of $S(\tilde{t})$, the slope of the entropy profile is almost zero, and the BLS entropy value is relatively low. The red square of Q(t) has a relatively high signal density and, when enlarged, shows an almost uniform signal distribution (see the bottom of Figure 3D). Referring to the proof in Appendix A, the red square should show a tendency of decrease and then increase in the signal density; however, it shows a uniform trend. This is because the slope values of the entropy profile are relatively low. The black square of $S(\tilde{t})$ contains the inflection point. The black square of Q(t) is in contact with the domain with a relatively high signal density on the left. Therefore, the position of the inflection point in $S(\tilde{t})$ can be inferred as the time (t = τ) at which a significant change in the signal density occurs in Q(t). A significant change indicates that the degree of increase or [...]
We investigated the BLS entropy profile for a binary time-series with a simpler structure to facilitate a better understanding of the features. We created three uniform binary time-series, Qj(t) = 1 (j = 1, 2, 3), with L = 500 and different signal densities: t = {1, 3, 5, ...} for j = 1, t = {1, 6, 11, ...} for j = 2, and t = {1, 11, 21, ...} for j = 3. We combined Q1(t) and Q2(t), as well as Q1(t), Q2(t), and Q3(t), to create two binary time-series, Q12(t) and Q123(t), with L = 1000 and 1500, respectively (Figure 4A). Therefore, the value of τ for Q12(t) is 500, and the values of τ for Q123(t) are 500 and 1000. $S(\tilde{t})$ for Q12(t) has an inflection point at $\tilde{t}$ = 26, which exactly corresponds to t = 500, whereas it has a local maximum and minimum at $\tilde{t}$ = 13 and 75, respectively (Figure 4B). These two values correspond to t = 225 and 725 in Q12(t), which are near the centers of Q1(t) and Q2(t), respectively. The inflection points for Q123(t) are at $\tilde{t}$ = 16 and 126, which exactly correspond to t = 500 and 1000, respectively. The local maxima and minima of $S(\tilde{t})$ are located near the central positions of Q1(t), Q2(t), and Q3(t). As the BLS entropy profile is obtained by correlating all the signals in a binary time-series, the partial structure of the entropy profile contains information about the entire time series [39]. Therefore, the local maximum (minimum) point may deviate from the center point to some extent, depending on the overall signal distribution.
Figure 4. (A) Two binary time-series, Q12(t) and Q123(t), created by combining Q1(t), Q2(t), and Q3(t). Q12(t) was created by the sequential combination of Q1(t) and Q2(t), and Q123(t) was generated by the combination of Q1(t), Q2(t), and Q3(t). (B) BLS entropy profiles corresponding to Q12(t) and Q123(t). (C) Binary time-series, Q(t), in which the signal density linearly decreases and then increases, its BLS entropy profile, and the derivative function of the entropy profile.
With conventional methods, determining the value of τ (the position of the inflection point in $S(\tilde{t})$) is difficult for a time series in which the signal density changes gradually and continuously. For example, let us consider a single Q(t) generated by combining two binary time-series Q1(t) = 1 for t = {1, 2, 4, 7, ..., 497} and Q2(t) = 1 for t = {500, 501, 503, 506, ..., 996} (Figure 4C). The signal density of Q(t) tends to decrease gradually and then increase again. To determine the value of τ with a conventional method, we would create a small time window on Q(t) and calculate the signal density within the window while shifting it in the direction of the time flow. Then, we would need to find the location of the window where the density reaches the threshold set for τ. With this approach, the value of τ depends on the window size and threshold value. In contrast, our approach determines the value of τ simply by locating the inflection point of the BLS entropy profile for Q(t) (Figure 4C, left). The inflection points of $S(\tilde{t})$ are $\tilde{t}$ = 15 and 76 (the right of Figure 4C), which are marked on Q(t) by two red lines. When differentiating $S(\tilde{t})$ to find the inflection point, two peaks are observed at $\tilde{t}$ = 44 and 46 in the derivative. This is because the lengths of the two BLS entropy profiles corresponding to Q1(t) and Q2(t) are not the same.
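The dependence of the conventional estimate on the window size can be illustrated with a short sketch. It applies a sliding density window to a series built like Q12(t), that is, 500 steps with a "1" every second step followed by 500 steps with a "1" every fifth step. The function names, the 0-based indexing, the two window sizes, and the threshold of 0.35 are illustrative assumptions and are not values from this study.

```python
def window_density(series, start, window):
    """Fraction of "1" signals in series[start : start + window]."""
    return sum(series[start:start + window]) / window

def first_window_below(series, window, threshold):
    """Start index of the first window whose density is below threshold.

    Returns None if no window falls below the threshold.
    """
    for start in range(len(series) - window + 1):
        if window_density(series, start, window) < threshold:
            return start
    return None

# Series analogous to Q12(t); with 0-based indices the density changes
# from 1/2 to 1/5 at index 500.
q1 = [1 if t % 2 == 0 else 0 for t in range(500)]
q2 = [1 if t % 5 == 0 else 0 for t in range(500)]
q12 = q1 + q2

tau_small_window = first_window_below(q12, window=20, threshold=0.35)
tau_large_window = first_window_below(q12, window=100, threshold=0.35)
```

The two calls use the same threshold but different window sizes, and they report different window positions for the same change in density. This is the parameter dependence that the inflection point of the BLS entropy profile avoids.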
In Figure 5, we examine the features of a binary time-series with H = 0.3. The local maximum (minimum) points (indicated by black triangles) and inflection points (indicated by red triangles) can be observed in the BLS entropy profile. The positions in Q(t) corresponding to the triangles are indicated by lines of the same color. The results show that the features of the BLS entropy profile sensitively detect the moments at which the signal density changes in a binary time-series, as can be confirmed visually.

Application: Characterization of Crawling Trajectories of Caenorhabditis elegans
Caenorhabditis elegans has 302 neurons, and their connections are well known. In addition, the worms have a transparent body and are hence easy to observe visually. Because of these characteristics, they have been widely used for exploring the relationship between neural control and biomechanics in organisms [41,42]. This relationship has been revealed by analyzing the crawling behavior on agar or swimming behavior in water [43]. The analyses require algorithms to characterize behavioral trajectories.
In this study, we converted the trajectory of C. elegans into a binary time-series and quantified the trajectory using the characteristic features of the BLS entropy profile. To this end, we conducted experiments on the behavior of 30 wild-type adult C. elegans on agar. The worms were cultured in a Petri dish (60 mm in diameter, 15 mm in height) filled with nematode growth medium in an incubator at 20 °C. The OP50 strain of E. coli was used as food for the worms. The test worms were allowed to acclimate for 15 min before recording their behavior. We performed 10 replicate experiments with different individuals for each of three groups: a control group, a group exposed to 0.5 ppm benzene, and a group exposed to 0.5 ppm formaldehyde. The crawling activity of the worms was monitored for 20 min using a vertically mounted Sony digital camcorder at a temporal resolution of 1/24 s. We extracted the central coordinate point of the worm body from each frame of the recorded movie. The coordinate points were used as the trajectory of the crawling movement. Figure 6 shows the typical trajectories of the worms belonging to the control, benzene-treated, and formaldehyde-treated groups. We measured the angle formed by the movement direction of the worm at times t and t + 1 to convert the two-dimensional trajectory into a binary time-series. The angle was converted to a binary number depending on the category to which it belongs. Here, the plane in which the individual moves is divided into eight angular categories, 0°-45°, 45°-90°, ..., 315°-360°, and binarized into (001), (010), ..., and (111), respectively. For example, if the worm advances by changing its direction by 25° with respect to the movement direction at time t, we can express the angle at time t + 1 as the binary number "001."
The trajectories of the control group tended to be complex in a long time scale and relatively simple and sinusoidal in a short time scale. The trajectories of the benzene-treated group showed a simple movement in both the long and short time scales compared with those of the control group. On the other hand, the trajectories of the formaldehyde-treated group were simpler than those of the control group and slightly more complicated than those of the benzene-treated group. The trajectories appeared to be somewhat differentiated for each group based on the number of local maximum points in the BLS entropy profile. The "findpeaks" function supported by MATLAB (MathWorks, 2019) was used to find the number of local maxima. This function finds only those peaks that have heights above a certain threshold. When the threshold value was too low, too many peaks were found, whereas when the threshold value was too high, only a few peaks were found. We tested several thresholds and chose an appropriate value (0.01). For the BLS entropy profiles for the control, benzene-treated, and formaldehyde-treated groups, the numbers of local maximum points were (mean, standard deviation) = (8.25, 3.15), (3.87, 3.75), and (7.0, 3.10), respectively. The trajectories of the control group were not statistically different from those of the formaldehyde-treated group, whereas they were significantly different from those of the benzene-treated group. In addition, there was no statistical difference between the trajectories of the benzene-treated and formaldehyde-treated groups (one-way ANOVA and Scheffe post-test, p < 0.05).
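The threshold-based counting of local maxima can be sketched in plain Python. This is a simplified stand-in, not a reimplementation of MATLAB's "findpeaks". The text does not state which option of that function the threshold of 0.01 was passed to, so the sketch takes the description literally and keeps strict local maxima whose height exceeds `min_height`. The function name and that parameter name are assumptions.

```python
def count_local_maxima(profile, min_height):
    """Count interior samples that are strictly higher than both
    neighbors and whose value exceeds min_height.

    Flat-topped peaks (plateaus) are not counted in this simple sketch.
    """
    count = 0
    for i in range(1, len(profile) - 1):
        is_peak = profile[i - 1] < profile[i] > profile[i + 1]
        if is_peak and profile[i] > min_height:
            count += 1
    return count

# Toy profile with two clear peaks (0.5 and 0.8) and one tiny peak (0.008).
toy_profile = [0.0, 0.5, 0.2, 0.004, 0.008, 0.002, 0.8, 0.1]
count_low_threshold = count_local_maxima(toy_profile, min_height=0.0)
count_mid_threshold = count_local_maxima(toy_profile, min_height=0.01)
count_high_threshold = count_local_maxima(toy_profile, min_height=0.6)
```

Raising the threshold removes peaks one by one, which mirrors the trade-off between too many and too few peaks described above.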
In this application case, we showed that the features defined in the BLS entropy profile for a binary time-series can be effectively used through an appropriate binarization process even when analyzing non-binary time-series data, such as the behavior trajectory of an organism.
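The binarization of a trajectory used in this application can be sketched as follows. Several details are not specified in the text and are assumptions of this sketch: the movement direction is taken from consecutive coordinate points with `atan2`, the turning angle is reduced to [0°, 360°), and the 3-bit codes of successive angles are concatenated into one binary series. The text lists codes from (001) to (111) for eight categories, which leaves one category without a code. Here the eighth category (315°-360°) is assumed to wrap to (000).

```python
import math

def turning_angles(points):
    """Turning angle in degrees, in [0, 360), between consecutive
    displacement vectors of a trajectory given as (x, y) points."""
    headings = [math.degrees(math.atan2(y1 - y0, x1 - x0))
                for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return [(h1 - h0) % 360.0 for h0, h1 in zip(headings, headings[1:])]

def angle_to_code(angle_deg, n_bins=8):
    """3-bit code of the 45-degree category containing angle_deg.

    Category k (0-based) maps to the binary form of k + 1, so that
    0-45 degrees gives "001" as in the text. Assumption: the last
    category wraps around to "000".
    """
    bin_index = int(angle_deg // (360.0 / n_bins)) % n_bins
    return format((bin_index + 1) % n_bins, "03b")

def trajectory_to_binary_series(points):
    """Concatenate the 3-bit codes of all turning angles (assumption)."""
    bits = []
    for angle in turning_angles(points):
        bits.extend(int(bit) for bit in angle_to_code(angle))
    return bits

# A short hypothetical trajectory: a slight left turn, then a slight right turn.
example_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (3.0, 0.5)]
example_series = trajectory_to_binary_series(example_points)
```

The resulting list of 0s and 1s can then be passed to any routine that computes a BLS entropy profile from a binary time-series.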

Discussion
In this study, we introduced the concept of a time circle, which allows the calculation of the BLS entropy profile from a binary time-series. Using the time circle, we observed that the local maximum (minimum) point, slope, and inflection point of the BLS entropy profile indicate the time at which the rate of change in the signal density becomes zero, the degree of change in the signal density, and the time at which the change in the signal density begins, respectively. In addition, through an application example, we showed that the findings are applicable to problems in various fields. In this section, we propose some candidates that capture characteristic features of a binary time-series.
The entropy profiles for Q2(t) and Q3(t), shown in Figure 7, contain several peaks. We observed that the peaks appeared when the distance between adjacent signals was suddenly lengthened and shortened. Understanding the peak generation in relation to the threshold would provide insights into a feature that can sensitively detect abnormal signals. No-signals, that is, "0" signals between a signal "1" and its neighboring signal "1," cause a stepped structure in the BLS entropy profile. This structure is shown in the entropy profile in Figure 3C. Through a quantitative analysis of this effect, the step height can be selected as a feature that detects a specific no-signal distribution. Figure 6 shows that the BLS entropy profile for the movement trajectory of C. elegans tends to increase or decrease in the long and short time scales. We can use the Fourier transform algorithm to separate the high and low frequencies and analyze the tendency at different time scales separately, which could allow an elaborate classification of more sophisticated time-series structures.
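The proposed separation of time scales can be sketched with a naive discrete Fourier transform. This is only an illustration of the idea, not an analysis performed in this study. The cutoff index, the function names, and the synthetic two-frequency profile are assumptions, and a fast Fourier transform from a numerical library would normally replace the O(n²) loops.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def split_by_frequency(profile, cutoff):
    """Split a profile into low- and high-frequency parts.

    Frequency index k and its mirror n - k describe the same oscillation,
    so a coefficient is kept in the low part when min(k, n - k) <= cutoff.
    The high part is the remainder, so low + high reproduces the profile.
    """
    n = len(profile)
    spectrum = dft(profile)
    low_spectrum = [c if min(k, n - k) <= cutoff else 0
                    for k, c in enumerate(spectrum)]
    low = [value.real for value in idft(low_spectrum)]
    high = [p - l for p, l in zip(profile, low)]
    return low, high

# Synthetic stand-in for a profile: one slow and one fast oscillation.
n_samples = 64
slow = [math.sin(2 * math.pi * 1 * t / n_samples) for t in range(n_samples)]
fast = [0.2 * math.sin(2 * math.pi * 10 * t / n_samples) for t in range(n_samples)]
synthetic_profile = [s + f for s, f in zip(slow, fast)]
low_part, high_part = split_by_frequency(synthetic_profile, cutoff=3)
```

With such a split, features such as local maxima or inflection points could be extracted from the long-time-scale and short-time-scale parts separately.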
Another interesting feature is the similarity between two binary time-series based on the BLS entropy profile. Given a binary time-series Q1(t), we can create a time series Q2(t) by adding several signals close to each other to Q1(t), and another time series Q3(t) by adding the same number of signals scattered from each other to Q1(t). The ED-based similarity between Q1(t) and Q2(t) is then the same as that between Q1(t) and Q3(t). In other words, the ED does not reflect information on the signal distribution. A new similarity (ρ) that overcomes this drawback can be defined as the correlation coefficient between the entropy profiles as follows:

$$\rho = \max\left[\mathrm{corr}\left(S_t^1, \mathrm{shifted}\left(S_t^2\right)\right)\right]$$

Here, "corr" represents the correlation coefficient between the two BLS entropy profiles, $S_t^1$ and $S_t^2$; "shifted($S_t^2$)" shifts $S_t^2$ cyclically by one time step; and "max" determines the largest value among the calculated correlation coefficients. As the BLS entropy profile contains information about the relative distances between all "1" signals in the time series, ρ is significantly different from the existing distance-based similarity. Figure 7 shows the binary time-series Q1(t), Q2(t), and Q3(t) for H = 0.0, 0.3, and 1.0, respectively, and their BLS entropy profiles. We compared the ED and ρ values for each pair of time series. In the ED, the similarity between Q2(t) and Q3(t) was the highest, followed by the similarity between Q1(t) and Q3(t) and the similarity between Q1(t) and Q2(t), in that order. However, the similarity between Q1(t) and Q2(t) visually appears to be the highest.
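The similarity ρ described above can be sketched as follows. The sketch assumes that the two profiles have equal length, that "corr" is the Pearson coefficient, and that the one-step cyclic shift is applied repeatedly so that every offset is tried; these are interpretations of the description, and the function name `rho_similarity` is hypothetical.

```python
import numpy as np

def rho_similarity(s1, s2):
    """Largest Pearson correlation between s1 and all cyclic shifts of s2."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    if s1.ndim != 1 or s1.shape != s2.shape:
        raise ValueError("profiles must be 1-D arrays of equal length")
    best = -1.0
    for k in range(s1.size):
        shifted = np.roll(s2, k)               # cyclic shift by k time steps
        r = np.corrcoef(s1, shifted)[0, 1]     # Pearson correlation coefficient
        # A constant profile yields NaN here; NaN never replaces `best`.
        best = max(best, r)
    return best
```

Taking the maximum over shifts makes the value insensitive to where the time circle is cut open, which seems to be the purpose of the "shifted" term.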
The ρ measurement showed that the similarity between Q1(t) and Q2(t) was the highest, followed by the similarity between Q2(t) and Q3(t) and the similarity between Q1(t) and Q3(t), in that order. This ranking is consistent with the visual comparison, which indicates that our similarity has an advantage over the existing distance-based similarity. In addition to the ρ used here, an ultrametric distance measurement based on the Pearson linear correlation coefficient and the Manhattan distance measurement can be used [44]. It would be interesting to determine an appropriate correlation coefficient according to the structure of the time series.
We consider it worthwhile to have proposed the concept of the time circle, which allows the calculation of the BLS entropy value for a binary time-series, and to have shown that the characteristic features of the time series can be captured from the BLS entropy profile. In addition, we believe that various features can be defined based on the BLS entropy profile, and that they can be effectively applied to various problems expressed in binary time-series.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Suppose there is a simple network in a time circle with two line segments, pq and pr, as branches in the BLS entropy. The segments have lengths $L_1$ and $L_2$, respectively (Figure A1A). Here, $L_1$ is fixed. We can calculate the BLS entropy value, S, for the network while moving point r from near point p (zone 1) to near point q (zone 2) along the circumference of the time circle (Figure A1B). To determine the value of $L_2$ that maximizes S, we can differentiate S with respect to $L_2$ and find the point at which the derivative becomes zero, as follows:

$$\frac{\partial S}{\partial L_2} = 0$$

The value of $L_2$ that satisfies the above condition, namely $L_2^*$, can be expressed as follows:

$$L_2^* = L_1$$

That is, if point r is very close to point q, $L_1$ and $L_2$ become almost the same, which maximizes the BLS entropy value. In this case, zone 2, which contains points q and r, has a higher signal density than zone 1, while the BLS entropy value at point p (zone 1) is relatively high. Conversely, when point r is located in zone 1, the density of zone 1 increases, and the BLS entropy value of point q in zone 2 increases.
We can extend this result to the case of point p with n line segments. Let $S(L_1, L_2, \ldots, L_{n-1})$ be the BLS entropy value at point p, which is the root node. Here, $L_i$ is the length of the line segment connecting point p and the i-th point on the time circle.
When a new point x is added to the time circle, we consider $f(x) = S(L_1, L_2, \ldots, L_{n-1}, L_x)$, where $L_x$ is the length of px. Then f(x) has a graph similar to that in Figure A1B (see the inset graph) and attains the maximum BLS entropy value at a point $x^*$. Moreover:

$$S(L_1, L_2, \ldots, L_{n-1}) = S(L_1, L_2, \ldots, L_{n-1}, 0) \tag{A5}$$

Therefore, there exists some δ > 0 such that $S(L_1, L_2, \ldots, L_{n-1}, L_x) < S(L_1, L_2, \ldots, L_{n-1})$ for x with |p − x| < δ. If we create a binary time-series with a signal density distribution using this process, we can see that the BLS entropy value is lower (higher) in the region where the signal density is relatively higher (lower).
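The argument in this appendix can be checked numerically with a short sketch. It assumes the usual BLS entropy definition, S = −Σ p_k log p_k / log n with p_k = L_k / Σ L_k, which is not restated in this appendix, and it assumes that the time steps map to equally spaced points on the circle with chords as branches. The names `bls_entropy` and `entropy_profile` are hypothetical.

```python
import numpy as np

def bls_entropy(lengths):
    """BLS entropy of one node from its branch lengths (assumed definition).

    Requires at least two branches, all with positive length.
    """
    lengths = np.asarray(lengths, dtype=float)
    p = lengths / lengths.sum()                  # branch length shares
    return float(-(p * np.log(p)).sum() / np.log(lengths.size))

def entropy_profile(series, radius=1.0):
    """BLS entropy at every "1" signal of a binary series on the time circle.

    Needs at least three "1" signals so that each node has two branches.
    """
    series = np.asarray(series)
    times = np.flatnonzero(series == 1)          # time steps carrying a signal
    angles = 2 * np.pi * times / series.size     # equally spaced circle points
    profile = []
    for a in angles:
        chords = 2 * radius * np.abs(np.sin((angles - a) / 2))
        chords = chords[chords > 0]              # drop the zero-length self branch
        profile.append(bls_entropy(chords))
    return times, np.array(profile)

# Two-branch case: scan L2 with L1 fixed and locate the entropy maximum.
L1 = 1.0
L2_grid = np.linspace(0.05, 2.0, 391)
best_L2 = L2_grid[np.argmax([bls_entropy([L1, L2]) for L2 in L2_grid])]

# Density case: a dense zone (t = 0..19) and a single distant signal (t = 60).
series = np.zeros(100, dtype=int)
series[:20] = 1
series[60] = 1
times, profile = entropy_profile(series)
```

Under these assumptions, `best_L2` coincides with `L1`, and the entropy at the isolated signal is higher than at a signal inside the dense zone, in line with the statement that the BLS entropy value is lower where the signal density is higher.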