A New Statistical Approach to the Optical Spectral Variability in Blazars

We present a spectral variability study of a sample of about 25 bright blazars, based on optical spectroscopy. Observations cover the period from the end of 2008 to mid 2015, with an approximately monthly cadence. Emission lines have been identified and measured in the spectra, which permits us to classify the sources into BL Lac-type or FSRQs, according to the commonly used EW limit. We have obtained synthetic photometry and produced colour-magnitude diagrams which show different trends associated with the object classes: generally, BL Lacs tend to become bluer when brighter and FSRQs become redder when brighter, although several objects exhibit both trends, depending on brightness. We have also applied a pattern recognition algorithm to obtain the minimum number of physical components which can explain the variability of the optical spectrum. We have used NMF (Non-Negative Matrix Factorization) instead of PCA (Principal Component Analysis) to avoid un-realistic negative components. For most targets we found that 2 or 3 meta-components are enough to explain the observed spectral variability.


Introduction
Optical spectroscopy of blazars has been classically used to classify objects between two major categories: BL Lacs and FSRQs.However, it is well known that spectral characteristics may vary, yielding different classifications of the same object depending on the activity level.It is expected that several mechanisms may contribute in different proportions to the optical emission: synchrotron radiation from the jet, disk and broad line region (BLR) emission is distinct from the host galaxy stellar population.The goal of this work is to investigate which mechanisms contribute most to the observed optical emission by looking at the long-term variability of the spectral shape and luminosity.Based on a large dataset of spectra obtained during a period of almost 10 years we used a statistical approach to search for the minimum number of components capable of reproducing the observed variability.

The Data
Data were retrieved from the public archive of the ground-based observational support program to the Fermi mission (http://james.as.arizona.edu/~psmith/Fermi/),conducted at Steward Observatory (University of Arizona).Data consists of a temporal set of low-resolution spectra, extending from 4000 Å to 7500 Å, obtained with the SPOL spectropolarimeter at the 1.5 m Kuiper and 2.3 m Bok telescopes [1].We have analysed all flux-calibrated spectra from 2008 to April 2015 1 .Our dataset consists of about 5000 spectra, belonging to 25 blazars.

Spectral Characterisation
Our first approach was to obtain a representative spectrum of each target, computed as the median value for each wavelength.Three different classes of objects can be identified according to the features observed in the representative spectrum: i-those showing prominent emission lines (see Figure 1), or FSRQ type; ii-those with a featureless spectrum (see Figure 2), or BL Lac type; iii-finally those dominated by the stellar population (see Figure 3), or host galaxy type.Spectral features can be reliably identified and measured.We have sorted the objects of our sample into BL Lac and FSRQ, applying the standard criterion EW rest < 5 [2] to the representative spectrum.The full sample and classification will be published elsewhere.Nonetheless, the object classification is not fully meaningful, given the strong observed variability.As an example, we show in Figure 2 the variation of the optical spectrum of BL Lac.It can be noticed that emission lines emerge only at low activity stages.

Synthetic Photometry
We have computed synthetic photometry from all flux-calibrated spectra.The range covered by the spectra nicely spans the transmission curves of the SDSS g filter in the blue and r in the red (see Figure 4).First we have built a colour-magnitude diagram including the whole dataset for all objects in our sample (see Figure 4).It can be seen that BL Lacs are generally redder than FSRQs, showing larger flux variations but similar colour changes.For those objects where the stellar population dominates the spectrum, the colour varies whereas the flux remains almost constant.Regarding the brightness range, the more distant objects appear as the fainter ones, as expected.We have also looked at the brightness and colour variations with time.The observed behaviour depends on the object class: in general terms, FSRQs exhibit a trend to become redder when brighter (RWB).In contrast, BL Lacs tend to become bluer when brighter (BWB).However when FSRQs get very bright, they change to show a BWB trend (see Figure 5).This fact was recognized by [3] and referred to as a "saturation effect".This can be explained by two different contributions in the optical: thermal emission from the accretion disk dominant in the blue region and the jet emission (variable) in the red one.At low flux levels the spectrum is blue and dominated by the disk, then it becomes redder when the red emission from the jet raises up to a certain level, when "saturation effect" takes place.Above this level, the jet emission dominates and the behaviour resembles those observed in BL Lac, i.e., BWB.This behaviour is explained by the dominance of a third variable and blue color component [4].The same effect can be seen in BL Lac-type objects (see Figure 6), although the magnitude when the effect occurs is less well determined, in some cases only upper limits can be established.

Spectral Decomposition
Given the availability of a large number of spectra obtained for each object, it is meaningful to adopt an statistical approach to search the relevant physical mechanisms capable of explaining the observed variability.It is expected that a combination of a few physical components (synchrotron, accretion disk and stellar population emission) can reproduce the observed flux and colour variations.For that purpose we have explored a mathematical technique called Non-Negative Matrix Factorization (NMF).
The NMF method: Techniques such as PCA (Principal Components Analysis) have been used to decompose a multivariate dataset into a set of successive orthogonal components that explain the maximum amount of variance [5,6].The goal is to project the data to a lower-dimensional space that preserves most of the variance.The main inconvenience of PCA is that components can be positive or negative, which is meaningless in many physical cases.Instead, other techniques such as NMF can be applied to ensure positive components.NMF has been applied in face recognition, genes pattern discovery [7] and to spectral data analysis in astronomical context [8].Mathematically, the problem corresponds to factorising a matrix into two matrices of lower dimension and with positive values A W × H.A main hypothesis of PCA or NMF decomposition is that each spectrum can be decomposed into a linear combination of a certain number of components.
We have applied the NMF method for objects in the sample with at least 50 observations.The minimum number of components has been selected based on various criteria: the capability to reproduce observed spectra, to avoid components with similar spectral shape as well as the smoothness of each component.For most cases we found a minimum number of 2 or 3.Here we present two examples of spectral decomposition using NMF: BL Lac and Mrk 501 (Figures 7 and 8).In the case of BL Lac, the best results are obtained with three components: the red one can be identified with the synchrotron emission extending to radio frequencies, whereas the green and magenta ones may correspond to disk emission or to another synchrotron peaking in the UV.In the low state the most intense component is the green one, whereas in the high state the red one dominates.In the case of Mrk 501, the best results are obtained with two components: the magenta one could be identified with a variable synchrotron emission, whereas the red one is dominated by the stellar population, which does not vary noticeably.Caveats: It is worth mentioning that it is possible that not all components found by the NMF decomposition correspond to a physical mechanism.Furthermore, each component is assumed to remain spectrally constant, which may not be always true, for example when the peak of synchrotron emission is within the optical range and moves along the wavelength.In these cases, an intrinsic variation in the spectral shape has to be reproduced by more than one component.On the other hand we plan to explore further the uniqueness of the components, NMF is an iterative process and there could be more than one combination of components capable of reproducing the observed variability.

Figure 1 .
Figure 1.Median spectra computed for two FSRQ type objects: 3C 454.3 and PKS 1510-08.The median spectrum is shown as the dark blue line, red and green lines represent the spectra at the maximum and minimum levels of activity.The light blue area indicates the spectral shape variation within the first and third quartiles.The features observed around 6900 Å and 7200 Å are due to uncorrected telluric absorption.There is also clear fringing at wavelengths redder than 6900 Å, due to the use of a thinned CCD.

Figure 2 .
Figure 2. Median spectra computed for two BL Lac type objects: OJ 287 and BL Lac.Curves represent the same as in Figure 1.

Figure 3 .
Figure 3. Median spectra computed for two objects dominated by the stellar population emission.Curves represent the same as in Figure 1.

Figure 4 .
Figure 4. Left panel: A sample spectrum to show the spectral range, overlaid with the transmission curves of the Sloan filters g' and r'.Right panel: Average colour g-r of the objects in our sample.

Figure 5 .
Figure 5. Colour light curves for two FSRQ type objects, built with the synthetic photometry data obtained from the Steward Observatory spectra.The bottom panel is a colour-magnitude diagram containing all data points.

Figure 6 .
Figure 6.Colour light curves for two BL Lac type objects, similar to Figure 5.

Figure 7 .
Figure 7. Non-Negative Matrix Factorization (NMF) decomposition of BL Lac spectra.Top panels show the decomposition of an observed spectrum (black), as the sum (cyan) of several components, shown with its corresponding scaling factor (red, magenta and green).Bottom panels show residuals.Right and left panels correspond to low and high flux states, respectively.