Change Point Detection in Terrorism-Related Online Content Using Deep Learning Derived Indicators

Abstract: Given the increasing occurrence of deviant activities on online platforms, it is of paramount importance to develop methods and tools that allow in-depth analysis and understanding, so that effective countermeasures can then be developed. This work proposes a framework for detecting statistically significant change points in terrorism-related time series, which may indicate the occurrence of events that warrant attention. These change points may reflect changes in the attitude towards and/or engagement with terrorism-related activities and events, possibly signifying, for instance, an escalation in the radicalization process. In particular, the proposed framework involves: (i) classification of online textual data as terrorism- and hate speech-related, which can be considered as indicators of a potential criminal or terrorist activity; and (ii) change point analysis in the time series generated by these data. The use of change point detection (CPD) algorithms on the time series of these indicators, in either the univariate or the two-dimensional case, can lead to the estimation of statistically significant changes in their structural behavior at certain time locations. To evaluate the proposed framework, we apply it to a publicly available dataset related to jihadist forums. Finally, topic detection on the estimated change points is implemented to further assess its effectiveness.


Introduction
In recent years, considerable terrorism-related activity, including propaganda dissemination, recruitment and training, fundraising, and the spreading of hate towards specific social groups, has been observed on various online platforms [1]. At the same time, several advanced methods have been developed that can analyze online textual content and extract information of interest, such as affiliations with terrorist groups and information related to terrorist events [2,3]. Such analysis can lead to the identification of key information in the fight against crime and terrorism; for instance, the early detection and analysis of crime- and terrorism-related information exchanged in online communities can promote efficient resource allocation towards mitigating serious incidents.
The first step in this process is the detection of content of interest, and, thus far, several works have focused on developing effective classification frameworks suitable for distinguishing between terrorism vs. non-terrorism [3] or extremism vs. non-extremism content [2], among others. These methods are oriented towards detecting suspicious content, without focusing on the significant changes that take place over time. Such an assessment can be performed using change point detection (CPD) methods applied to suitably constructed time series, which can serve as indicators of terrorism or crime activity. More specifically, one can detect significant changes in the time series of posts related to terrorism and hate speech; the position of these changes may reflect changes in the attitude towards and/or engagement with terrorism-related activities and events that trigger users of social media platforms/forums to display a more intense online activity in the vicinity of these time points. Overall, the idea of using a CPD method on time series of terrorism- or hate speech-related posts can be seen as an alternative way to identify links between online activity and terrorism.
In this direction, this paper proposes a terrorism-related change point detection framework which builds on univariate and multivariate time series. Specifically, this framework facilitates the identification of points in time where statistically significant changes occur in the underlying data. Estimated by exploiting the temporal evolution of several indicators, such points constitute structural breaks in the behavior of the time series and may indicate the occurrence of important events that warrant attention. Moreover, in the case of multivariate CPD, possible correlations between the time series of different indicators can also be exploited.
In general, CPD methods are divided into two main categories: online methods [4], which aim to detect changes in real time, and offline methods [5], which retrospectively detect changes by considering historical data. For example, if data consisting of terrorism-related content or hate speech are considered as the underlying data for the CPD algorithms, then the change points estimated by offline methods can offer a useful statistical analysis of such data, identifying patterns and balancing the trade-off between correctly identified change points and false alarms. In the case of online methods, the estimated time locations of structural breaks could enable interested parties (e.g., law enforcement) to respond in a timely manner with the aim of preventing possible radicalization and terrorist or criminal activities. In this work, our interest lies in the offline methods.
Overall, the main contribution of this work is the adoption of a change point detection method to estimate the time locations of statistically significant changes in terrorism-related time series, based on a set of indicators, for an effective analysis of trends and changes in a criminal context. Specifically, the detection of change points is performed in univariate as well as multivariate time series, attempting to exploit possible correlations that may exist between the time series of different indicators. The presence of terrorism-related content and the expression of hate speech are detected using state-of-the-art deep learning methods (namely, Convolutional Neural Networks (CNNs)), and the resulting frequencies are used as inputs to the CPD algorithm. The evaluation carried out on data collected from a jihadist forum showcases the appropriateness of the proposed terrorism-related change point detection framework for identifying changes at time locations where more attention could possibly be given. The satisfactory performance can be attributed to its ability to detect structural breaks in the time series (either univariate or multivariate) based on the time evolution of their statistical properties. To the best of our knowledge, this is the first time that change point detection algorithms have been combined with the frequencies of online textual data classified as related to terrorism and/or hate speech by well-established classification models.
The remainder of the paper is structured as follows. In Section 2, we present a brief overview of the classification and change point detection methods. In Section 3, we detail the specific setup of the proposed pipeline, whereas, in Section 4, we demonstrate its applicability. In Section 5, we discuss the results. Finally, in Section 6, we summarize our main findings, discuss possible limitations of the proposed framework and provide future directions.

Related Work
This section reviews related work, focusing first on change point detection methods and then presenting commonly used text classification methods whose output can be the basis for effectively detecting statistically significant changes in the behavior of a time series.
Change Point Detection (CPD). Regarding the application of CPD methods in online sources (e.g., social media and Surface/Dark Web), most existing works consider Twitter data. Change point algorithms applied to time series related to Twitter posts typically aim to discover the occurrence of events of interest that could be associated with changes in the structural behavior of the time series. For example, a nonparametric method for change point detection via density ratio estimation has been developed for tracking the degree of popularity of a given topic by monitoring the frequency of selected words [6]. Moreover, change points have been detected in Twitter streams using temporal clusters of hashtags in online conversations related to specific events [7]. CPD methods have also been combined with the outcomes of sentiment analysis in Twitter posts where the estimation of change points includes the detection of changes related to significant events [8]. Additionally, three time series produced based on tweets with positive, negative and neutral sentiment, respectively, have been used as input to change point detection towards estimating correlations among the different sentiments [9].
Concerning the use of CPD methods in terrorism-related data, the Noordin Top terrorist network data from 2001 to 2010 have been analyzed to detect significant changes in the evolution of their structure using a social network change detection method [10]. Moreover, a method for multiple change point detection in multivariate time series has been applied in a time series produced by the counts of terrorism events across twelve global regions [11]. Finally, a marked point process framework has been proposed to model the frequency and the impact of terrorist incidents based on change point analysis to search for timestamps where the process undergoes significant changes [12].
In this work, change point detection is used as a tool to detect changes in the behavior of the time series that may indicate the occurrence of events that warrant attention. It is applied to terrorism-related online content, also considering the presence of hate speech. This is achieved by building upon well-established deep learning-based classification models.
Text Classification. The detection of deviant content (such as terrorism-related, extremist or abusive content) in online platforms is often addressed as a classification problem. For example, a content analysis framework has been developed in order to identify extremist-related conversations on Twitter [2]. In a similar direction, focusing on the Islamic State of Iraq and al-Sham (ISIS), content collected from social media sources has been utilized for the automatic detection of extremism propaganda [13]. Finally, a lot of effort has been placed in detecting abusive behaviors in general, such as racist and sexist content [14] or hate speech from content extracted from the white supremacist Stormfront forum [15].
Towards the development of effective classification methods, deep learning has been extensively used, with Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) being among the most popular methods. CNNs were originally developed for image processing, achieving groundbreaking results in recognizing objects from a pre-defined list [16]. Due to their performance in image processing tasks, CNNs gained a lot of attention and were subsequently applied to various Natural Language Processing (NLP) tasks, such as text classification or categorization [17], sentiment analysis [18] and machine translation [19]. In addition to CNNs, RNNs have been widely used in NLP tasks [20]. The main difference between the two lies in the ability of RNNs to process data that come in sequences, e.g., sentences. Specifically, they analyze a text word by word and store the semantics of all the previous text in a fixed-size hidden layer [21]. Detecting terrorism-related content or the expression of hate speech in the online world can constitute an important source of knowledge for the early detection of threatening situations (such as the manifestation of terrorist attacks). To this end, in this work, commonly used deep learning methods are considered to develop effective text classification models, with particular focus on detecting: (i) crime- and terrorism-related activities (terrorism-related classification model); and (ii) the expression of hate speech (hate speech classification model), which constitutes an indirect way of expressing violence towards a group of people (e.g., minorities). The knowledge extracted from both the terrorism-related and hate speech classification models is then used as the basis of the proposed terrorism-related change point detection framework.

Materials and Methods
Our approach to detect statistically significant changes in content of interest, and specifically in our case in terrorism-related content, involves the following two steps: (i) classification of online material as directly related to terrorism or as containing expressions of aggressive behavior that can be considered as an initial stage which can evolve into something more dangerous (such as crime and terrorism); and (ii) change point detection that could ultimately signify the occurrence of an event of interest. Such an approach will allow the interested parties (e.g., law enforcement) to obtain a more comprehensive and thorough understanding of how crime and terrorism-related activities are carried out and evolve through time. An illustration of the overall framework is depicted in Figure 1.

Classification of Online Material
First, we detail the classification framework developed for organizing content collected from online sources into two predefined sets of categories: (i) related to terrorism or not; and (ii) containing hate speech or not. As discussed, deep learning methods, and specifically CNNs, have gained significant popularity in NLP tasks, and therefore we opt to use them for our framework; we also experimented with RNNs, which did not yield any improvement in the overall performance. Specifically, two distinct CNN-based classification models are constructed, i.e., a terrorism-related classification model and a hate speech classification model, using the same architecture, inspired by Kim [22]; Figure 2 depicts this CNN-based model.
Preprocessing. Before feeding any text to the network, a set of preprocessing steps was applied to reduce noise. First, we converted the text to lowercase and then removed hyperlinks, mentions, numbers, punctuation, accent marks, diacritics, and short and long words (with <2 and >20 characters, respectively). After that, we tokenized the sequence and performed lemmatization on each term, utilizing the WordNetLemmatizer function of the nltk package (http://www.nltk.org/api/nltk.stem.html?highlight=wordnetlemmatizer; accessed on 18 March 2021).
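The preprocessing steps listed above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline actually used; the regular expressions, the order in which accents and punctuation are handled, and the function name `preprocess` are assumptions. Lemmatization is left as a pluggable callable (identity by default) so that the sketch runs without external data; the `lemmatize` method of nltk's `WordNetLemmatizer` could be passed in its place.

```python
import re
import unicodedata

# Assumed patterns for hyperlinks and @-mentions (not specified in the text).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")

def preprocess(text, lemmatize=lambda token: token):
    """Return the cleaned tokens of one post."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Strip accent marks and diacritics: decompose, then drop combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove numbers and punctuation by keeping only ASCII letters and whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Whitespace tokenization, then drop short (<2) and long (>20) tokens.
    tokens = [t for t in text.split() if 2 <= len(t) <= 20]
    return [lemmatize(t) for t in tokens]

print(preprocess("Check THIS out @user http://example.com 123 café!!! a"))
```

Keeping only ASCII letters also discards tokens written in non-Latin scripts, which matches the English-only focus stated for this work but would need revisiting for multilingual data.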
Embedding layer. The first layer of the neural network architecture is a static embedding layer, which maps each word to a high-dimensional vector. We opted for pre-trained GloVe word embeddings to semantically represent textual content [23]. In particular, we use word vectors of dimension 100. According to Mikolov et al. [24], 50 to 300 dimensions can model hundreds of millions of words with high accuracy. We experimented with word vectors of different dimensions, ranging from 50 to 200, and chose 100 as a good compromise between performance and computation time.
Neural Network layer. Various CNN-based architectures were tested and evaluated by changing the number of CNN layers, the number of filters and the kernel size. In the end, a single CNN layer was used, since it resulted in the best performance, with 20 filters, kernel size 3 and ReLU as the activation function. A 1D average pooling layer was added on top of the convolutional layer to downsample its output, and a flatten layer followed to transform the feature map matrix into a single column. Finally, a dropout layer with rate p = 0.5 was used, and the sigmoid function was employed as the final activation function. For compiling the model, we used the Adam optimizer with learning rate 0.0001 and binary cross-entropy as the loss function.
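To make the data flow through these layers concrete, the sketch below runs a single inference-time forward pass in plain Python: embedding lookup, one convolutional layer (20 filters, kernel size 3, ReLU), 1D average pooling, flattening and a sigmoid output. It is an illustration only. The vocabulary size, sequence length, pooling size and random weights are assumptions (none is given in the text), and a real model would be built and trained with a deep learning library using pre-trained GloVe vectors.

```python
import math
import random

def conv1d_relu(x, kernels, biases):
    """'Valid' 1D convolution over time followed by ReLU.
    x: T vectors of size d; kernels: one [kernel_size][d] matrix per filter."""
    k, d = len(kernels[0]), len(x[0])
    out = []
    for t in range(len(x) - k + 1):
        window = x[t:t + k]
        out.append([
            max(0.0, b + sum(w[i][j] * window[i][j] for i in range(k) for j in range(d)))
            for w, b in zip(kernels, biases)
        ])
    return out

def avg_pool1d(x, size):
    """Non-overlapping average pooling along the time axis."""
    return [[sum(col) / size for col in zip(*x[s:s + size])]
            for s in range(0, len(x) - size + 1, size)]

def forward(token_ids, embedding, kernels, biases, dense_w, dense_b, pool):
    x = [embedding[i] for i in token_ids]       # static embedding lookup
    x = conv1d_relu(x, kernels, biases)         # (T - 2) x 20 feature map
    x = avg_pool1d(x, pool)                     # downsample along time
    flat = [v for row in x for v in row]        # flatten
    # Dropout (p = 0.5) is only active during training, so it is skipped here.
    z = dense_b + sum(w * v for w, v in zip(dense_w, flat))
    return 1.0 / (1.0 + math.exp(-z))           # sigmoid probability

# Hypothetical sizes: only DIM, FILTERS and KERNEL come from the text.
random.seed(0)
VOCAB, DIM, FILTERS, KERNEL, SEQ_LEN, POOL = 50, 100, 20, 3, 12, 2

def rand():
    return random.uniform(-0.1, 0.1)

embedding = [[rand() for _ in range(DIM)] for _ in range(VOCAB)]
kernels = [[[rand() for _ in range(DIM)] for _ in range(KERNEL)] for _ in range(FILTERS)]
biases = [0.0] * FILTERS
flat_len = ((SEQ_LEN - KERNEL + 1) // POOL) * FILTERS
dense_w = [rand() for _ in range(flat_len)]
tokens = [random.randrange(VOCAB) for _ in range(SEQ_LEN)]

prob = forward(tokens, embedding, kernels, biases, dense_w, 0.0, POOL)
print(f"predicted probability: {prob:.3f}")
```

With 100-dimensional embeddings, the convolutional layer alone has 3 x 100 x 20 weights plus 20 biases, i.e., 6020 trainable parameters, which is small compared with the embedding matrix.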
To build the classification models, ground truth annotated datasets are necessary. Next, we describe the datasets used for building the terrorism and hate speech classification models.
Building the terrorism-related classification model. Due to the absence of a well-established ground truth dataset that characterizes text as terrorism or non-terrorism related, we constructed the ground truth by combining two widely used datasets: (i) the "How ISIS uses Twitter" dataset available at Kaggle (https://www.kaggle.com/fifthtribe/how-isis-uses-twitter; accessed on 26 February 2021), which contains ≈17 k tweets from 100+ pro-ISIS fanboys from all over the world since the November 2015 Paris Attacks; and (ii) the "Hate speech offensive tweets" dataset [25], which consists of ≈24 k labeled tweets organized into three classes, i.e., hate speech, offensive and neither. Since this work focuses on analyzing content in English, non-English posts were disregarded.
Overall, to build the ground truth, we considered the first dataset as terrorism-related and the second one as non-terrorism-related, since it is less likely to contain any terrorism-related content; the latter was constructed by randomly retrieving content from Twitter, based on a set of hate speech-related words. The newly created dataset was split into a training set of 37,973 samples, a test set of 3797 samples and a validation set of 421 samples.
Building the hate speech classification model. To build the hate speech classification model, two datasets were combined: (i) a hate speech dataset that contains texts extracted from the Stormfront forum [15], which consists of 1190 hate and 9462 non-hate instances; and (ii) the "Hate speech offensive tweets" dataset [25], mentioned above, which contains ≈24 k samples categorized into three classes, i.e., hate, offensive and neither. We considered the "hate" and "offensive" instances as part of the hate class, and the rest, labeled as "neither", were used for the non-hate class. Overall, the constructed ground truth dataset consists of ≈35 k samples and was split into training (90%) and test (10%) sets, maintaining the proportion of classes. From the training set, 10% was kept as a validation set.
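A class-preserving split of this kind can be sketched as below. This is a minimal standard-library illustration, not the code used to produce the reported splits; the helper name `stratified_split`, the seeds and the toy class counts are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, holdout_fraction, seed=0):
    """Return (kept_indices, holdout_indices), preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept, holdout = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_holdout = round(len(indices) * holdout_fraction)
        holdout.extend(indices[:n_holdout])
        kept.extend(indices[n_holdout:])
    return sorted(kept), sorted(holdout)

# Toy labels (hypothetical counts, not the paper's data).
labels = ["hate"] * 700 + ["non-hate"] * 300

# Step 1: 90% training+validation, 10% test.
train_val, test = stratified_split(labels, 0.10)

# Step 2: keep 10% of the training portion for validation.
tv_labels = [labels[i] for i in train_val]
train_pos, val_pos = stratified_split(tv_labels, 0.10, seed=1)
train = [train_val[i] for i in train_pos]
val = [train_val[i] for i in val_pos]

print(len(train), len(val), len(test))
```

Splitting each class separately is what keeps the hate to non-hate ratio the same in every subset, which matters when one class is much larger than the other.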
Classification Performance. To evaluate the performance of the proposed classification models, standard evaluation metrics were used, i.e., accuracy, F1-score and the Area Under Curve (AUC) value. For the terrorism-related classification model, the overall accuracy and F1-score are equal to 93%, with 99% AUC (as shown in Table 1). For the terrorism class, the model achieves an F1-score equal to 91%, while the non-terrorism class obtains 94%. For the hate speech classification model, we also achieve 93% overall accuracy and F1-score (Table 2). Moreover, the AUC score is 98%. For the hate and non-hate classes, the F1-score equals 94% and 91%, respectively. Both classification models achieve particularly good performance compared to other works that also use neural networks for text classification [26,27], which highlights the appropriateness of using them for the categorization of textual data into categories of interest.
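For reference, the three metrics can be computed as in the following sketch, which uses only the standard library. The labels and scores are toy values, not the paper's results, and the 0.5 decision threshold is an assumption. AUC is computed through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical labels and sigmoid outputs).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [int(s >= 0.5) for s in scores]   # assumed threshold of 0.5

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

Unlike accuracy and F1-score, AUC does not depend on the decision threshold, which is why it can be high even when a few examples fall on the wrong side of 0.5.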

Change Point Detection Method
The change point detection (CPD) method applied in this work can take into account univariate as well as multivariate time series and can be used to detect any distributional change within a sequence (e.g., regarding the mean, variance, etc.). The algorithm is called E-Divisive and constitutes a nonparametric approach for CPD in a set of multivariate observations [28].
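Let $X_n = \{X_i : i = 1, \ldots, n\}$ and $Y_m = \{Y_j : j = 1, \ldots, m\}$ be independent and identically distributed samples, where $n$ and $m$ denote the size of each sample. Samples $X_n$ and $Y_m$ consist of $d$-dimensional random variables with distributions $F_1$ and $F_2$, respectively. An empirical divergence measure is defined as follows:

$$\hat{\mathcal{E}}(X_n, Y_m; a) = \frac{2}{mn} \sum_{i=1}^{n} \sum_{j=1}^{m} |X_i - Y_j|^{a} - \binom{n}{2}^{-1} \sum_{1 \le i < k \le n} |X_i - X_k|^{a} - \binom{m}{2}^{-1} \sum_{1 \le j < k \le m} |Y_j - Y_k|^{a},$$

where $|\cdot|$ denotes the Euclidean norm and $a \in (0, 2)$. For the detection of a single change point, a scaled sample measure of the above divergence measure is defined as

$$\hat{Q}(X_n, Y_m; a) = \frac{mn}{m + n} \hat{\mathcal{E}}(X_n, Y_m; a).$$

Let $Z_1, Z_2, \ldots, Z_T \in \mathbb{R}^d$ be an independent sequence of observations and let $1 \le \tau < \kappa \le T$ be constants, where $T$ denotes the length of the time series of observations. The sets $X_\tau = \{Z_1, Z_2, \ldots, Z_\tau\}$ and $Y_\tau(\kappa) = \{Z_{\tau+1}, Z_{\tau+2}, \ldots, Z_\kappa\}$ are defined, and a change point location $\hat{\tau}$ is estimated as

$$(\hat{\tau}, \hat{\kappa}) = \operatorname*{argmax}_{(\tau, \kappa)} \hat{Q}(X_\tau, Y_\tau(\kappa); a).$$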
If it is known that at most one change point exists, then $\kappa = T$ is fixed. To estimate multiple change points, the above technique is applied iteratively. Suppose that $k - 1$ change points have been estimated at time locations $0 < \hat{\tau}_1 < \cdots < \hat{\tau}_{k-1} < T$. These partition the observations into $k$ clusters $\hat{C}_1, \ldots, \hat{C}_k$, such that $\hat{C}_i = \{Z_{\hat{\tau}_{i-1}+1}, \ldots, Z_{\hat{\tau}_i}\}$, in which $\hat{\tau}_0 = 0$ and $\hat{\tau}_k = T$. Given these clusters, the procedure for finding a single change point is applied to the observations within each of the $k$ clusters. The corresponding test statistic for the $k$th estimated change point is given by $\hat{q}_k = \hat{Q}(X_{\hat{\tau}_k}, Y_{\hat{\tau}_k}(\hat{\kappa}_k); a)$, where $\hat{\tau}_k = \hat{\tau}(i)$ denotes the $k$th estimated change point, located within cluster $\hat{C}_i$, and $\hat{\kappa}_k = \hat{\kappa}(i)$ is the corresponding constant. The running time of this iterative procedure is $O(kT^2)$, where $k$ denotes the (unknown) number of change points.
For the determination of the statistical significance (p-value) of each change point, a permutation test is implemented under the null hypothesis of no additional change points. First, the observations within each cluster are permuted to construct a new sequence of length $T$. Then, the estimation procedure is reapplied to detect change points in the permuted observations. This process is repeated and, after the $l$th permutation of the observations, the test statistic $\hat{q}_k^{(l)}$ is recorded. With $R$ permutations in total, the approximate p-value is given by $\#\{l : \hat{q}_k^{(l)} \ge \hat{q}_k\} / (R + 1)$.
Overall, the change point detection algorithm is implemented via the following procedure, as illustrated in Figure 3. At first, the time series is segmented into two clusters $C_1$, $C_2$ based on the time location $\hat{\tau}$ that maximizes the measure $\hat{Q}$. Then, it is determined via a permutation test whether the estimated change point at time $\hat{\tau}$ is statistically significant. If the estimated change point is not statistically significant, it is concluded that there are no change points in the time series of interest. However, if the estimated change point is statistically significant, the time series is divided into two clusters of observations and the previous step is re-applied in each of these two clusters. This procedure is iterated in each of the clusters created by the statistically significant change points detected so far, and the algorithm terminates when no additional statistically significant change points are found.
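The overall procedure, including the permutation test, can be sketched as follows. This is a simplified illustration rather than a reference implementation of the method in [28]: within each cluster $\kappa$ is fixed at the cluster end, `min_size` is a hypothetical minimum segment length, and the p-value is taken as the number of permuted statistics at least as large as the observed one divided by $R + 1$. The toy series and the choice $R = 99$ are for demonstration only.

```python
import math
import random
from itertools import combinations

def _mean_between(A, B, a):
    return sum(math.dist(x, y) ** a for x in A for y in B) / (len(A) * len(B))

def _mean_within(A, a):
    return sum(math.dist(x, y) ** a for x, y in combinations(A, 2)) / math.comb(len(A), 2)

def q_stat(X, Y, a):
    """Scaled empirical divergence between two samples of tuples."""
    n, m = len(X), len(Y)
    e = 2 * _mean_between(X, Y, a) - _mean_within(X, a) - _mean_within(Y, a)
    return n * m / (n + m) * e

def best_over_segments(Z, bounds, a, min_size):
    """Best split (global index, statistic) over all current clusters."""
    best_tau, best_q = None, -math.inf
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        seg = Z[lo:hi]
        for tau in range(min_size, len(seg) - min_size + 1):
            q = q_stat(seg[:tau], seg[tau:], a)
            if q > best_q:
                best_tau, best_q = lo + tau, q
    return best_tau, best_q

def e_divisive(Z, a=1.0, R=199, sig_level=0.05, min_size=3, seed=0):
    """Return the sorted list of statistically significant change points."""
    rng = random.Random(seed)
    bounds = [0, len(Z)]
    while True:
        tau, q = best_over_segments(Z, bounds, a, min_size)
        if tau is None:                      # no cluster is long enough to split
            break
        exceed = 0
        for _ in range(R):
            perm = list(Z)
            # Permute observations within each cluster (null: no further change).
            for lo, hi in zip(bounds[:-1], bounds[1:]):
                block = perm[lo:hi]
                rng.shuffle(block)
                perm[lo:hi] = block
            _, q_perm = best_over_segments(perm, bounds, a, min_size)
            exceed += q_perm >= q
        if exceed / (R + 1) > sig_level:     # not significant: stop
            break
        bounds = sorted(bounds + [tau])      # accept and split further
    return bounds[1:-1]

# Toy series: 12 low values followed by 12 high values.
Z = [(float(v),) for v in [0, 1] * 6 + [5, 6] * 6]
change_points = e_divisive(Z, R=99)
print(change_points)
```

A returned index `tau` means that the first `tau` observations form one cluster and the next cluster starts at position `tau + 1` in the 1-based notation used above.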

Dataset for Evaluation Purposes
In order to showcase the applicability of the proposed framework, we relied on the Ansar dataset (https://www.azsecure-data.org/dark-web-forums.html; accessed on 7 April 2021), a publicly available dataset containing terrorism-related posts. More specifically, Ansar is a collection of posts published in the Ansar AlJihad Network, a set of invitation-only jihadist forums in Arabic and English that are known to be popular with Western Jihadists [29]. The English portion of the dataset, referred to as Ansar1, contains 29,492 posts and spans the period from 8 December 2008 to 20 January 2010. This portion still contains some Arabic posts, which were disregarded; after this filtering, its size equals 24,130 instances.

Results
This section illustrates the applicability and performance of the proposed terrorism-related change point detection framework, when applied to the Ansar1 dataset.
Extraction of indicators based on the constructed classification models. As already mentioned, both terrorism- and hate speech-related indicators are used as input to the proposed terrorism-related change point detection framework. To this end, we first exploit the classification models presented in Section 3.1, i.e., the terrorism and hate speech ones, to characterize texts as belonging to the terrorism or non-terrorism class and as containing hate speech or not. The output is then exploited by the change point detection algorithm to ultimately detect previously unknown change points in the related time series that probably signify the occurrence of events of interest.
Time series. Overall, two time series are constructed and used as input to the CPD algorithm: (a) the time series of posts classified as terrorism-related; and (b) the time series of posts identified as containing hate speech. The posts are aggregated on a daily basis, resulting in two time series of length $T = 408$ (days), which are presented in Figures 4 and 5, respectively. These time series seem to evolve in a similar way, although the frequencies observed in the time series of terrorism-related posts are much higher.
Change Point Detection in the Univariate Case. The CPD method presented in Section 3.2 is applied to each of the two above-mentioned time series and estimates changes in the mean value of the considered data. For the implementation of the method, we set $a = 1$ and use $R = 499$ permutations to estimate the statistical significance of each change point, with a significance level of 0.05. The results regarding the time series of terrorism-related posts are presented in Table 3 and graphically depicted in Figure 6a, whereas for the time series of hate speech, the results are presented in Table 4 and Figure 6b.
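The daily aggregation that produces the two input series can be sketched as follows. The record layout (posting date plus two boolean classifier outputs) and the helper name `daily_counts` are hypothetical; days without any flagged post are kept as zeros so that the series has one value per calendar day.

```python
from collections import Counter
from datetime import date, timedelta

def daily_counts(post_dates, start, end):
    """Number of posts per calendar day from start to end (inclusive)."""
    counts = Counter(post_dates)
    n_days = (end - start).days + 1
    return [counts.get(start + timedelta(days=i), 0) for i in range(n_days)]

# Hypothetical classified posts: (posting date, is_terrorism, is_hate).
posts = [
    (date(2009, 1, 1), True, False),
    (date(2009, 1, 1), True, True),
    (date(2009, 1, 3), False, True),
    (date(2009, 1, 4), True, False),
]
start, end = date(2009, 1, 1), date(2009, 1, 4)

terrorism_series = daily_counts([d for d, terr, _ in posts if terr], start, end)
hate_series = daily_counts([d for d, _, hate in posts if hate], start, end)

print(terrorism_series)
print(hate_series)
```

Each series can then be passed to the change point detection step as a list of one-element tuples.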
Considering the time series of terrorism-related posts, six change points are estimated as statistically significant (see Table 3), whereas, when the time series of posts including hate speech is considered, there are three estimated change points (see Table 4).
Change Point Detection in the Multivariate Case. Apart from applying CPD to the univariate case, as performed previously, we can also exploit possible correlations that may exist between the two time series using multivariate CPD. To this end, we combine the two time series of terrorism-related posts and hate speech into a single two-dimensional time series $Z_1, Z_2, \ldots, Z_T$, $T = 408$, with $Z_i = (z_{i,1}, z_{i,2})$, $i = 1, 2, \ldots, 408$, where the first entry of the observation vector $Z_i$ (i.e., $z_{i,1}$) is the frequency of the posts classified as terrorism-related and the second one (i.e., $z_{i,2}$) denotes the frequency of the posts classified as containing hate speech. The attempt to combine terrorism-related posts with hate speech rests on the idea that hate speech, in the sense of expressing aggressive behaviors, may be related to terrorism and vice versa. This is especially plausible given that the underlying dataset is based on jihadist forums, where terrorism-related topics of discussion and the expression of aggressive behaviors may be more frequent. The results of the two-dimensional CPD are presented in Table 5 and depicted graphically in Figure 7. It is observed that the estimated change points in the two-dimensional case are the same as those estimated for the univariate time series of terrorism-related posts and presented in Table 3, apart from the point at time $t = 237$. The estimated change point at time location $t = 237$ is close to the second estimated change point for the time series of posts classified as hate speech (see Table 4).
Moreover, this point (i.e., t = 237) also appears to be a statistically significant change point for the time series of terrorism-related posts if a significance level of 0.1 is used. Overall, it seems that the time series of terrorism-related posts has more impact on the two-dimensional model than the time series of posts related to hate speech. Regarding the estimated change points in the two-dimensional time series (Figure 7), some conclusions can be drawn by comparing the time locations of the points with the terrorist incidents that occurred during 2009, which covers the main part of the Ansar1 dataset (a list of widely known terrorist attacks can be found, for example: (a) at https://en.wikipedia.org/wiki/List_of_terrorist_incidents_in_2009; accessed on 22 April 2021; (b) at https://www.dni.gov/nctc/index.html; accessed on 22 April 2021; or (c) in [30]). It can be argued that the time period between the estimated change points at time locations t = 77 (23 February 2009) and t = 119 (6 April 2009) appears to have an increasing trend, which is more evident in the frequency of posts that belong to the terrorism-related class. Therefore, the first estimated change point at t = 77 signals an upward change in the frequency of posts classified as terrorism-related and as containing hate speech, probably due to the terrorist incidents that occurred at that time. Regarding the period between the second estimated change point at t = 119 (6 April 2009) and the third one at t = 148 (5 May 2009), it can be argued that even more intense online activity (i.e., an even steeper increasing trend) is observed compared to the previous period.
This may be partially interpreted based on two factors: (a) the terrorist incidents that occurred in the previous period (e.g., Bomb explosion in Afghanistan on 25 March 2009 and Suicide bombing in Pakistan on 27 March 2009) caused an increasing trend related to the aftermath of the attacks; and (b) other terrorist attacks took place in the period delimited by the second and third estimated change point, which enhanced the online activity. Therefore, the second estimated change point at t = 119 signifies an upward (and sharper) change compared to the previous period.
Regarding the period which is bounded between the third and the fourth estimated change point at time locations t = 148 (5 May 2009) and t = 216 (12 July 2009), respectively, it can be argued that the frequency of posts appears to have a stable trend at a high level compared to the previous periods. This stable trend at high frequencies may be partially explained by the two factors that are also mentioned above, i.e., the terrorist incidents that occurred in the previous period triggered an online activity that lasts and is related to the aftermaths of the attacks, and the additional terrorist incidents that occurred in the period between the third and fourth estimated change points preserved the online activity related to terrorist topics and hate speech at a high frequency level. Therefore, the third estimated change point at time t = 148 signals the beginning of a period with stable trend at high frequencies.
A similar interpretation of the results, as the one derived for the time period between the third and the fourth estimated change points, can also be used for the period between the fifth and sixth estimated change points at time locations t = 237 (2 August 2009) and t = 346 (19 November 2009), respectively. In addition, the fourth estimated change point at time t = 216 signals the beginning of a short period with a decreasing trend between the two periods of stable trend at high frequencies. Finally, the two last change points estimated at time locations t = 346 (19 November 2009) and t = 373 (16 December 2009) signal the beginning of two periods with decreasing trends in the frequency of terrorism-related posts and hate speech, partially indicating that the interest of forum users in terrorism-related topics has decreased.
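As a small worked check, the index and date pairs quoted in this section are consistent with a convention in which time index t falls t days after 8 December 2008, the first day of the Ansar1 collection period. This convention is inferred from the reported pairs and is not stated explicitly in the text, so the sketch below should be read as an assumption; the function name `index_to_date` is hypothetical.

```python
from datetime import date, timedelta

START = date(2008, 12, 8)   # start of the Ansar1 period, per the dataset description

def index_to_date(t):
    """Assumed convention: index t falls t days after the start date."""
    return START + timedelta(days=t)

# Change points reported for the two-dimensional time series.
change_points = [77, 119, 148, 216, 237, 346, 373]
for t in change_points:
    print(t, index_to_date(t).isoformat())
```

Under the same convention, the last index, T = 408, corresponds to 20 January 2010, the final day of the collection period.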
Topic Detection. To further evaluate the effectiveness of the proposed framework, we proceed with an analysis of the topics discussed within different time periods based on the detected change points, as listed in Table 5. Specifically, we follow the Latent Dirichlet Allocation (LDA) topic detection process in each resulting time period. LDA is a generative statistical model that aims to find distinct topics in document collections [31]; to this end, it models each document as a mixture of latent topics, where a topic is described by a distribution over words. We apply the gensim version of the LDA method (https://radimrehurek.com/gensim/models/ldamodel.html; accessed on 5 May 2021). The specific parameters used for the LDA model are listed in Table 6. For the topic detection, we focused mainly on the time periods where a more intense online activity is observed, either via the existence of an increasing trend (6 April-5 May 2009) or via a stable trend at a consistently high level (5 May-12 July and 2 August-19 November 2009) in the frequencies. For each of the aforementioned time periods, we ran the LDA method for a number of topics between 2 and 10 in steps of 1 and concluded that at most five topics resulted in a clear set of distinct topics. The results are presented in Table 7.
Regarding the first time period (6 April-5 May 2009), which signals the intensification of the posting activity, we observe that the attention is highly focused on destruction and deaths related to terrorist attacks. This is in line with a set of terrorist incidents that took place in the previous period, which may have attracted the attention of people, leading to increased online activity and intense discussions around them.
Moving on to the next time period (5 May-12 July 2009), where online activity remains at consistently high rates, the discussion of the aftermath of the terrorist incidents continues and extends to new incidents that took place during this period (e.g., the 20 June 2009 Taza bombing, with at least 73 deaths and more than 200 injured (https://en.wikipedia.org/wiki/2009_Taza_bombing; accessed on 22 April 2021)). The discussions are now more oriented around the government and the military, as well as the arrests and evidence found. As expected, discussions about injuries and deaths continue with undiminished interest. Finally, there is increased interest in and discussion of religious issues, which have often been linked to terrorist attacks.
In the following short period (12 July-2 August 2009), although the intensity of the discussions decreases, the attention remains on the same points as in the previous time period. During the last presented time period (2 August-19 November 2009), which indicates the final resurgence of interest, discussions also begin to focus on issues related to security, education and protection. As expected, discussions related to religion and god persist, as do those related to the deaths and killings that occurred in the recent past.
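The topic-count sweep described above (fitting LDA for 2 to 10 topics and inspecting the top words of each topic) can be illustrated with a dependency-free sketch. Note the assumptions: this is not the gensim implementation used in this work, which relies on online variational Bayes; a small collapsed Gibbs sampler is swapped in so that the example runs with the standard library alone. The toy corpus, the hyperparameters `alpha` and `beta`, and all function names are hypothetical placeholders and do not reproduce Table 6 or the dataset.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (doc_topic, topic_word) counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # n_dk
    topic_word = [Counter() for _ in range(num_topics)]  # n_kw
    topic_total = [0] * num_topics                       # n_k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # p(z = k | rest) ~ (n_dk + alpha)(n_kw + beta) / (n_k + V*beta)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Most frequent words per topic, skipping words with zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical tokenized posts for one time period (placeholder data).
toy_docs = [
    ["government", "military", "arrest", "evidence"],
    ["government", "military", "evidence"],
    ["security", "education", "protection"],
    ["security", "protection", "education", "security"],
    ["religion", "god", "religion"],
    ["god", "religion", "government"],
]

# Sweep the number of topics from 2 to 10 in steps of 1, as in the text.
results = {k: top_words(fit_lda(toy_docs, k)[1]) for k in range(2, 11)}
for k, topics in results.items():
    print(k, topics)
```

In this sketch, as in the procedure described above, the choice of the number of topics is left to inspection of the printed word lists; with a corpus this small, larger values of k simply produce empty or redundant topics.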

Discussion
Overall, the idea of applying change point detection to the time series of posts related to terrorism and hate speech rests on the fact that statistically significant changes estimated at certain time positions may indicate the occurrence of events at those times that deserve attention. These events can be related to well-known terrorist incidents that prompt users of social media platforms/forums to exhibit more intense online activity regarding these incidents and their aftermath, as in our case.
Based on the results of the retrospective change point analysis of the time series of terrorism-related posts and of posts containing hate speech (as presented in Section 4), some further conclusions can be drawn. First, it can be argued that the intensity of online activity seems to be aligned, to a great extent, with the intensity of terrorism or crime incidents. This conclusion is supported by the fact that, during the periods where online activity shows an increasing trend (i.e., 23 February 2009-6 April 2009 and 6 April 2009-5 May 2009) or is stable at a high frequency level (i.e., 5 May 2009-12 July 2009), a considerable number of terrorist incidents took place worldwide. Moreover, the estimated change points associated with the increasing trends partially coincide with the time locations of terrorist incidents. This is the case, for example, for the estimated change points at times t = 77 (23 February 2009) and t = 119 (6 April 2009), both of which signify the beginning of periods with increasing trends.
Finally, regarding the topic detection and its results, the analysis of the most popular topics discussed in the periods of greatest interest confirms the suitability of the proposed change point detection method for a better understanding of the trends around topics of interest, as well as the identification of patterns.

Conclusions
In this study, a change point detection framework was adopted to retrospectively detect statistically significant changes in underlying data in the context of terrorism-related activities. Specifically, a nonparametric approach was followed and applied to univariate and multivariate time series, enabling the exploitation of possible correlations that may exist between the time series of the different indicators. The proposed framework was applied to a real-world dataset to demonstrate its potential for effectively detecting such changes. Both terrorism- and hate speech-related indicators were considered as input to the terrorism-related change point detection framework.
Based on the results of the application, it can be inferred that the proposed framework could be seen as an alternative way to identify links between terrorism and online activity, since the estimated change points in the time series of frequencies are partially connected to the time locations of terrorism incidents. This suggests that criminal/terrorist events prompt users of social media platforms/forums to exhibit more intense online activity. However, depending on the forum and the users, more intense online activity regarding terrorism, or criminal activities in general, may precede the occurrence of events, and in this case the proposed framework can serve as a means of early warning.
Some limitations apply to the current work. First, annotated datasets are difficult to find, especially in the terrorism- and crime-related context. The lack of appropriate datasets in the domain prevents the comparison and cross-validation of the proposed approach in different settings. Furthermore, the focus on English content limits the generalization of the results. Finally, the time complexity of the CPD algorithm is quadratic in the length of the time series. In this respect, more time-efficient CPD methods could be applied for very large time series.
As future work, we intend to also apply online change point detection methods to terrorism-related data, which may serve as a tool for detecting the onset of radicalization or criminal activities in real time. Moreover, additional indicators could be extracted and fed to the multivariate change point detection method, for instance, the sentiments or emotions expressed towards an event of interest.