Assigning Degrees of Stochasticity to Blazar Light Curves in the Radio Band Using Complex Networks

We focus on characterizing the high-energy emission mechanisms of blazars by analyzing the variability in the radio band of the light curves of more than a thousand sources. We are interested in assigning complexity parameters to these sources, modeling the time series of the light curves with the method of the Horizontal Visibility Graph (HVG), which allows us to obtain properties from degree distributions, such as a characteristic exponent to describe its stochasticity and the Kullback–Leibler Divergence (KLD), presenting a new perspective to the methods commonly used to study Active Galactic Nuclei (AGN). We contrast these parameters with the excess variance, which is an astronomical measurement of variability in light curves; at the same time, we use the spectral classification of the sources. While it is not possible to find significant correlations with the excess variance, the degree distributions extracted from the network are detecting differences related to the spectral classification of blazars. These differences suggest a chaotic behavior in the time series for the BL Lac sources and a correlated stochastic behavior in the time series for the FSRQ sources. Our results show that complex networks may be a valuable alternative tool to study AGNs according to the variability of their energy output.


Introduction
Complex networks are a powerful tool to study physical phenomena in a wide variety of systems and topics from a different perspective than usual approaches [1]. Depending on the characteristics to be explored in each investigation, there are different types of graphs and representation structures to consider. Here, we are interested in modeling time series with the techniques from the family of visibility algorithms [2]. In one of the first approaches to astrophysical systems through the use of Horizontal Visibility Graph (HVG), we have shown that this method is able to detect differences in particle velocity distributions in plasma simulations [3]. Moreover, the HVG has proved to be a robust method to characterize the solar wind plasma and has been used to study turbulent magnetic field [4], velocity fluctuations [5] and light curves of pulsating variable stars [6]. Being the closest star to Earth, the Sun and the solar wind are arguably the most studied astrophysical systems, corresponding to a valuable laboratory of natural plasma physics. During the last decades, several space missions have been launched and surveyed the space environment, making many discoveries. In contrast, the study of distant objects, such as blazars, has comparatively fewer high-quality datasets available.
Angel and Stockman [7] indicated that the word blazar was proposed by Edward A. Spiegel in the Pittsburgh Conference on BL Lac Objects in 1978, which is a combination of BL Lacertae object and quasar. Blazars are a particular type of active galactic nuclei (AGN). Emission within an AGN is produced by the accretion of matter from a black hole at its center, where the surrounding material forms an accretion disk that is heated by the dissipation of gravitational energy, generating in some cases the expulsion of matter and energy in relativistic jets. An AGN is powered by the conversion of gravitational potential energy into radiation, although the rotational kinetic energy of the black hole can also serve as an important energy source; moreover, plasma jets are formed when the black hole rotates and the accretion disk is strongly magnetized [8]. The above details comprise what we can consider the main idea of the unified model of an AGN. This model accounts for observational differences among AGNs, which are due to the different orientations of the objects as seen from Earth and the different accretion rates and masses of the central black holes [9]. Observations show that a blazar is an AGN with a jet of matter moving at relativistic velocities oriented near our line of sight. Blazars are the most violent AGN, emitting predominantly non-thermal radiation with strong variability across the electromagnetic spectrum [8], from the radio band to extremely high gamma-ray energies on time scales that can be as short as minutes. Among blazars, we can distinguish BL Lacertae (BL Lac) objects, which have weak or continuous featureless emission lines in the optical spectrum, and flat-spectrum radio quasars (FSRQ), which have prominent emission lines in the optical spectrum.
The variability of blazars can be observed in different energy bands. To investigate the physical mechanisms that generate the observed variations in blazar light curves, many studies have been carried out at various wavelengths. Some techniques work from the frequency domain and others work from the time domain. The main tool of analysis to use in different bands is to determine their Power Spectral Densities (PSD) [10], since the PSD can provide clues about the mechanism driving the variability. For instance, Max-Moerbeck et al. [11] have modeled light curves as red noise processes with the PSD to model the variability and to set constraints on the statistical significance of interband correlations. In addition, light curves and PSDs have been investigated with several other methods. Gaussian Process (GP) is especially useful for analyzing astronomical timeseries data, and there are even studies that have initiated more active methodological discussion on multiband time-series data by implementing multi-output GP [12]. In Tarnopolski et al. [13], the toolset includes Lomb-Scargle Periodogram (LSP) [14], wavelet scalogram [15], Autoregressive Moving Average process (ARMA) [16], Continuous-time ARMA (CARMA) [16], the Hurst exponent (H) [17] and others. In fact, an algebraic relationship between the H-exponent of the time series and the exponent of the power-law degree distribution of the visibility graph (non-horizontal) has been matched. It has been shown that the exponent of the power-law degree distribution depends linearly on H [2]. H measures the statistical auto-similarity of a time series, i.e., the long-range dependence or memory of a process. Small-scale studies have been made to classify light curves with the H, where it has been found that two FSRQs and four BL Lac exhibit long-term memory in the underlying process governing the optical variability of 44 identified blazar candidates [17]. The overall challenge is to apply effective techniques to model the complex nature of light curve variations that occur in different bands and time scales.
We have started a study of the variability properties in the radio band of blazars observed with a large-scale, fast cadence 15 GHz radio monitoring program with the Owens Valley Radio Observatory (OVRO) 40 m Telescope that has produced 12 years of data for over 1800 sources observed twice a week [18]. Thus, this set of time series corresponds to the most comprehensive study of blazar variability in the radio band available at this time and is ideal for conducting our study, presenting a new perspective on the methods commonly used to study AGNs. We focus on characterizing the high-energy emission mechanisms of blazars by analyzing the variability in the radio band of the light curves of more than a thousand sources. We seek to describe the light curves, i.e., to analyze the observed flux density as a function of time of these sources as a first approximation of the complexity parameters in active galactic nuclei. We are interested in assigning degrees of stochasticity to blazars, modeling the time series of light curves as complex networks. For this purpose, we rely on visibility algorithms that convert time series into graphs, where the structure of the series is preserved in the graph topology [19].
HVG [20] is a novel and direct method that can represent time series as a network according to a geometric criterion that considers the magnitude of the data and its horizontal visibility with others in the time domain. Here, we use directed and undirected HVG to get degree distributions in both cases, which allows us to calculate a characteristic exponent as the degree of stochasticity, which is also the Kullback-Leibler Divergence (KLD), respectively. The last one is known under a variety of names, including the Kullback-Leibler distance, cross-entropy, information divergence, and information for discrimination [21]. Meanwhile, stochastic processes also play a fundamental role in many scientific fields where we can find a dynamic in a collection of random variables evolving over time [22]. Luque et al. [23] has demonstrated that the method we apply here efficiently discriminates randomness and not only uncorrelated randomness from chaos, but also more complicated stochastic processes in time series can be identified, such as fractional Brownian motion. KLD is sensitive to non-evident characteristics of time series [3]. The HVG is a method that allows us to analyze time series and its irreversibility through the calculation of the Kullback-Leibler divergence (KLD). The advantage of KLD is that unlike other measures used to estimate irreversibility over time, KLD is statistically significant, as demonstrated by the Chernoff-Stein lemma [20]. Moreover, in the case of astronomical observations, it is common for the data measured by telescopes to be contaminated by atmospheric (reversible) noise, yet irreversible signals continue to be well characterized by the HVG method and the KLD measurement [20].
In this study, we model time series of the light curves with the algorithm of the Horizontal Visibility Graph to measure a characteristic exponent to describe its stochasticity and the Kullback-Leibler Divergence to detect a different behavior of the light curves of blazars. We will analyze if the properties of degrees distributions are connected with the spectral classification of blazars, and we are interested in contrasting with a common measurement of variability in light curves, the excess variance. To the best of our knowledge, this is the first approach to the study of blazars using HVG, which manages to identify different ranges of KLD for different light sources. The paper is organized as follows. Section 2 presents the HVG method and defines how we obtain the degree of stochasticity and the KLD value from the complex network. Section 3 explains technical details about these radio observations and shows a sample of a light curve to illustrate the characteristics of data series used by this study. Section 4 exposes our results after applying the method on all available blazar light curves, and we finally discuss these results in Section 5.

Horizontal Visibility Algorithm
The HVG method allows the study of dynamic systems through the characterization of their networks associated with the time series [24]. As we can see in Figure 1, the HVG algorithm consists first of assigning a longitudinal node to each data in the time series. Then, depending on its magnitude, the node acquires horizontal visibility to other past and future nodes in the sense of time; this is the directed Horizontal Visibility Graph (DHVG). In that way, the number of times that the node establishes an in and out connection is calculated to have degrees k in and k out for each node. With the already constructed DHVG, we can extract the undirected version (UHVG), just considering a total degree k ud = k in + k out .
There are more details about geometrical criteria in Acosta-Tripailao et al. [3]. Thus, by counting the frequency of occurrence of each degree, we obtain degrees distribution or probability distributions in the form P(k) = n k /n, where n is the number of data points in the time series and n k is the number of nodes having degree k. Namely, P = P(k ud ) for UHVG and P in = P(k in ) with P out = P(k out ) for DHVG. Each data correspond to a longitudinal node. (b) According to the visibility of each node, we can calculate the degrees k in and k out for directed HVG and k ud for undirected HVG: that is, how many times the node establishes a connection as a function of time direction and independently of this one.

UHVG to Evaluate γ-Exponent
With UHVG, any time series maps to a network with an exponential behavior for the degree distribution of the form P(k) = 1 3 2 3 k−2 [23]. This can be rewritten as with γ un = ln(3/2) ≈ 0.405, that is a limit for the uncorrelated situation proposed by Lacasa and Toral [25] to discriminate between correlated stochastic (γ > γ un ), or chaotic (γ < γ un ) processes. Here, γ is the characteristic exponent (the γ-exponent) of the degree distribution modeled as P(k)∼exp(−γk). The value for γ un was supported by analytical developments that confirmed the results provided by numerical simulations and experimental time series, but new studies show there are certain exceptions to this rule to take in consideration. Ravetti et al. [26], Zhang et al. [27] studied in depth the methodology proposed by Lacasa and Toral [25] and found several cases in which their hypothesis is not valid.
The choice of the domain for the fitted straight line of the logarithm of the probability distribution is very delicate. Ravetti et al. [26] found that sometimes, non-exponential behaviors occur, and the heavy tail of the degree distribution makes the method dependent on additional adjustments on a case-by-case basis. However, the gamma value gives useful information about the process thanks to the strengths of the HVG method that manages to maintain the intrinsic characteristics by mapping each time series conserving its properties, as long as exponential behaviors are obtained and the fitting zone is properly chosen to determine the γ-degree. Independent of the limit, it is a useful technique for a systematic analysis of long-and short-range stochastic processes with the right criteria in the fitting domain (k range).

DHVG to Estimate D-Value
Degree distributions P in and P out separately classify the succession between past and future events; that is, they provide information about the temporal irreversibility of the associated series. At the same time, they provide a relation with the entropy production of the physical mechanism generating the series [28]. A rigorous way to measure the difference between two degree distributions is through Kullback-Leibler divergence, which is a statistical measure of "distinguishability" [20] to quantify the degree of temporal irreversibility. The KLD between two probability functions is defined as [21] KLD is always positive and vanishes if and only if P in = P out . As D moves away from zero, the distance between the distributions increases and with it the irreversibility of the series.
Since it is not symmetrical, it is not a real distance measurement. In addition, there are some cases in which D → ∞ when P in = 0. We must take into consideration that some events are unseen, especially when dealing with observational data with short duration and gaps.
While the presence of gaps is not a problem for the HVG, it is not prudent to assume every event as absolutely impossible. Therefore, we reassign a new, very low, probability when it is zero. The cases where P out = 0 are contained in the definition itself. So, to cover the other cases, we smooth the probabilities in assigning a probability less than the minimum P min in in the form P min in /n when for certain k the probability is zero, and we subtract this new probability from the others to rescale. These considerations will allow us to compare degrees between spectral classes of blazars.

Blazar Light Curves
The blazar subclasses, BL Lac and FSRQ, are defined by the properties of their optical spectra. The spectra of FSRQs show broad emission lines, while BL Lacs show very weak or no emission lines [8,29]. Other properties of the sources are also correlated with the subclasses as described in the references above. In this article, we focus on working with observational data of blazars in the radio band.
The OVRO data were obtained from the OVRO 40 m Telescope Monitoring Program [18]. The telescope uses off-axis dual-beam optics in which the beamwidth (FWHM) is 157 arc seconds. The cryogenic receiver uses a HEMT amplifier and is centered at 15 GHz with 2 GHz equivalent noise bandwidth. Gain fluctuations and atmospheric and ground contributions are removed with the double-switching technique where one of the beams is always pointed at the source. Details of the observation and data reduction [18] cover the absolute calibration and the uncertainties, which include both the thermal fluctuations in the receiver and systematic errors that have been added under a rigorous procedure [18].
It is important to study the amplitude of variability in AGN light curves. The importance of variability lies in the fact that, being a unique property of blazars, it can be used as a tool to distinguish them from other astrophysical objects [17]. One of the quantities most commonly used to estimate this property is the excess variance. According to Turner et al. [30], the normalized excess variance is designating the flux density for the n points in each light curve as X i , with its arithmetic mean µ and errors σ i . The excess variance is useful for comparing the variability in different light curves. As a first application of complex networks to radio light curves, we use σ rms to contrast with measures that also distinguish the amplitude of the light curve variability, now according to its visibility (explained in Section 2). Note that we use the square root of the excess variance, which is also known as the fractional root mean square variability amplitude [31,32]. Both quantities give us the same information, but the last one is a linear statistic that gives it in percentage terms [10].

Results
We analyze the physical properties that could be conditioning the behavior of the light curves using DHVG and UHVG. Initially, we work with a data set that contains 1298 sources, where 400 are BL Lac and 898 are FSRQ. The optical classes are taken from the Roma BZCAT Multi-Frequency Catalog of Blazars 5th edition. The Multi-Frequency Catalogue of Blazars is one of the most complete lists of Active Galactic Nuclei whose emission properties are recognized as typical of blazars. It includes the list of sources and an essential compilation of multifrequency data from radio to gamma rays [33].
We applied the algorithm of HVG over the time series of the light curves of blazars finding an exponential behavior on the degree probability distributions in most light curves, as shown in Figure 2b for both methods UHVG (black dots) and DHVG (color solid lines). Once we calculate the degree probability distribution P(k) of the UHVG, we proceed to compute the critical exponent γ as the slope of the semilog plot with the method of least squares in a range from k = 3 to some k with the lowest probability, without including it and avoiding the low probability floor in the heavy tails (see Figure 2b in Section 4). When making this adjustment, we find that some degree probability distributions did not fit an exponential, so we discarded those light curves. Just 5% of BL Lac and 6.1% of FSRQ are discarded because they deviate from the fit (some selected and discarded cases in Figures A1-A4). Now, having the well-adjusted light curves, we obtain the critical exponent γ (UHVG) per source and we plot the PDF for these data sets, as is shown in Figure 3a. The statistical detail of those PDFs analyses is in Table 1. Figure 3a shows a clear difference between the light curves of BL Lac and the FSRQ. In the dotted line, we mark the limit proposed by Lacasa and Toral [25] between stochastic time series and chaos for the UHVG analysis. From Figure 3a, we observe when plotting the PDFs that the curves separate around this limit. That is, the BL Lac sources have a peak of the γ PDF on the left of the dotted line, with γ = 0.392, whereas the FSRQ sources show the peak of the γ PDF on the right of the limit γ un ≈ 0.405, with γ = 0.446. Thus, most BL Lac light curves show a chaotic behavior, while most FSRQ light curves show a time series with a correlated stochastic behavior.
As a second analysis, we calculate the KLD using the technique explained in Section 2.2; thus, the results obtained for D correspond to a smoother weighting of the original values. Now, it is possible to analyze cases of high irreversibility without necessarily assigning an infinity. Figure 3b shows the PDF for the value of D, which is a measure of the irreversibility of the time series. In this case, both classes of light curves show the same behavior of this parameter, low values of D between 10 −3 and 10 −1 as is shown in Table 2, but the peaks of these PDF do not show a difference between BL Lac and FSRQ.
In order to find possible correlations between parameters of the complex network extracted from these time series and a physical measure of variability in AGN, we compare the values of σ rms , that is the root square of the excess variance [30], and the critical exponent γ in Figure 4a and with the D value in Figure 4b, which are all dimensionless quantities.

Discussion
We have applied the method of DHVG and UHVG to a sample of 1298 light curves measured from blazars. From this analysis, we find an exponential behavior on the degree probability distribution P(k) for most studied sources (Figure 2b). We compute the critical exponent γ from the P(k) in the UHVG, noting that the degree distributions are capable of detecting differences in the spectral classification of blazars, as is shown in Figure 3a.
In fact, the division between the peak reached by the PDF of γ for the BL Lac is on the left of the γ un limit; meanwhile, the peak reached by the PDF of γ for the FSRQ sources is on the right of this limit. That difference suggests a chaotic behavior in the time series for the BL Lac sources and a correlated stochastic behavior in the time series for the FSRQ sources. This result indicates that the distribution of the degrees, i.e., how the flux density is distributed in time, is not the same for the light curves sources studied. So, the distribution of the degree k of the classes of light curves is not the same, and it seems that the critical exponent γ from the exponential adjustment could be useful to distinguish between these two types of blazars. On the other hand, when measuring the irreversibility of the time series with the DHVG, the distance D does not have the necessary sensibility to identify the two types of light sources (Figure 3b).
In Figure 4a,b, we plot the critical exponent γ (UHVG) versus the square root of the excess variance and the same for parameter D (DHVG) in order to find a correlation between complex networks and a variability parameter of blazars. The excess variance is a quantity that indicates the observed relative strength of the variability of an astronomical source. Figure 4a shows a slow tendency for blazar classes, whereas the KLD and the excess variance do not seem to exhibit a significant correlation in Figure 4b. However, as a first approach to the study of blazars with HVG, in Figure 4b, we managed to identify different ranges of D for different sources (between 10 −3 and 10 −1 ). On the other hand, it is recommended that for better comparison between sources, the excess variance should be calculated using observations of the same duration [30]. This data set contains time series from 259 to 1140 data points. Therefore, even if we could truncate the light curves to match the shortest observation, this would considerably reduce our data set and analysis. However, these quantities may not be the best parameters to consider.
Other ways to analyze the complex networks and their degrees distribution P(k) also can be considered. For instance, in many cases, it is useful to consider also the complementary cumulative distribution function or CDF of a variable with a power-law distribution [27,35]. However, here, we have systematically obtained exponential distributions represented by the γ exponent. Thus, here, we have focused on the use of the HVG as a method to distinguish the different light curves, and to tackle this purpose, we have used the HVG and analyzed the distribution of the values of the γ. In addition, it is known that when using the HVG, any random series results in a network with a degree distribution of exponential type, and it has been suggested that this is a universal feature [6]. If exponential forms are not obtained, the series are related to non-randomness [23]. Nevertheless, it would be interesting to explore with CDF to obtain perhaps a more robust statistical fitting, as discussed in the study of Zhang et al. [27]. We will continue the analysis using other tools from complex networks analysis as well as other physical parameters such as the PSD of the flux density time series. Although we did not find a clear correlation between γ or D and the excess variance, the PDF of the critical exponent of the degree probability distribution does show a clear difference between the two blazar classes. Thus, our results may open a new framework for the study of blazars, in which complex networks may be a valuable alternative tool to study AGNs according to the variability of their flux density.