Macroeconomic News Sentiment: Enhanced Risk Assessment for Sovereign Bonds

We enhance the modelling and risk assessment of sovereign bond spreads by taking into account quantitative information gained from macroeconomic news sentiment. We investigate sovereign bond spreads of five European countries and improve the prediction of spread changes by incorporating news sentiment from relevant entities and macroeconomic topics. In particular, we create daily news sentiment series from sentiment scores as well as positive and negative news volume and investigate their effects on yield spreads and spread volatility. We conduct a correlation and rolling correlation analysis between sovereign bond spreads and accumulated sentiment series and analyse changing correlation patterns over time. Market regimes are detected through correlation series, and the impact of news sentiment on sovereign bonds in different market circumstances is investigated. We find the best-suited external variables for forecasts in an ARIMAX model set-up. Error measures for forecasts of spread changes and volatility proxies are improved when sentiment is considered. These findings are then utilised to monitor sovereign bonds from European countries and detect changing risks through time.


Introduction
In the wake of the sovereign debt crisis in Europe, managing and monitoring credit risk arising from sovereign bonds are increasingly important. European countries have undergone changes in terms of their financial stability, and credit spreads have widened due to increased financial risk. Modelling of sovereign bond spreads is often linked to various macroeconomic factors such as the countries' GDP growth rate or inflation. These macroeconomic factors are monitored via scheduled announcements from official bodies e.g., treasuries and national banks but are also covered in news articles and unscheduled announcements. Changes in country dynamics and risks are reported and captured in news, which are classified as "macroeconomic news", and can be closely monitored and quantified through news sentiment analysis.
News sentiment for equities, and in particular its use in equity trading, has been widely covered in various studies over recent years. An overview of equity modelling and predictability enhancements through news sentiment is given in (Mitra and Mitra 2011). The dynamics of asset prices, in particular their volatility, are clearly affected by news events. These events are classified and quantified, and news sentiment can be utilised to enhance volatility prediction (see, e.g., (Mitra et al. 2009)). Sentiment analysis is further used to improve trading decisions in equity markets. Firm-specific news sentiment affects the predicted asset return distribution; taking sentiment values into consideration increases the accuracy of a forecast and contributes to improved portfolio decisions, as discussed in (Leinweber and Sisk 2011; Mitra et al. 2018), amongst others. In the Fixed Income market, however, news sentiment and its potential influence on bond spreads have only recently become more relevant in the light of the electronification of bond trading (Bech et al. 2016), and sentiment lacks thorough investigation in this market. Macroeconomic news sentiment for sovereign bond spreads in particular, but also firm-specific news sentiment for corporate bond spreads, can add value to both monitoring and forecasting of bonds. In this paper, we aim to fill this gap and provide an extensive study of the effects of news sentiment on bond spread predictions. In particular, we investigate the influence of macroeconomic news sentiment on bond spreads and develop a method to improve prediction and monitoring of sovereign bonds.
When analysing bond spreads of European countries, various studies (e.g., (Bernoth and Erdogan 2012; Caggiano and Greco 2012; Maltritz 2012)) found that international and country-specific risk factors such as government debt, as well as market dynamics such as liquidity issues and fiscal policies, affect bond spreads. Economic fundamentals are seen as drivers of sovereign spreads (see (Dewachter et al. 2015)); they have been utilised to explain yield spread movements, and a significant effect has been found. According to a study by (Afonso et al. 2015), the factors that influence sovereign spreads in Europe are time varying. The authors highlight that financial determinants have changing effects on spreads and that their influence increases in times of crisis. A further investigation of time-varying factors can be done by considering macroeconomic news, which reports on changing dynamics and influences from issuing and neighbouring countries. News and sentiments for sovereign bond spreads were investigated by (Beetsma et al. 2013; Mohl and Sondermann 2013), amongst others. They investigated the influence of news announcements on spreads during the European debt crisis and found evidence that information from government statements as well as news from a European newsflash platform influenced yield spreads both nationally and across countries, pointing to spill-over effects in the debt crisis.
Our paper contributes to the current literature an in-depth analysis of the impact of processed macroeconomic news and its sentiment on European sovereign yield spreads. In particular, we investigate the dynamics of daily sovereign spread changes and find a relation between their daily forecasts and news sentiment time series. Our findings show that the forecast of yield spreads can be enhanced when daily news sentiment is taken into account. News is split into positive and negative news items; their influences are investigated separately as well as jointly in a multivariate Autoregressive Integrated Moving Average model with explanatory variables (ARIMAX). We study various combinations of external variables and give details and results of the five model settings that produce the best forecast outcome. The ARIMAX model gives daily one-step-ahead predictions of spread changes and volatility proxies. The in-depth analysis of ARIMAX performance and its improvements through external news variables leads us to propose the enhancement of sovereign bond analysis by including significant news time series for five European markets.

Bond Data
In order to establish rules to monitor risks of a bond portfolio over time, we analyse sovereign bond spreads of five European countries. We distinguish between short-term bonds with a maturity between three months and five years and long-term bonds with maturities between five and 30 years. We analyse sovereign bond data from Thomson Reuters DataScope (see (Thomson Reuters DataScope 2018) for more details; a subscription is needed, therefore our utilised datasets cannot be made openly available) and calculate spreads between the bond yields and the AAA-rated bond yield quoted by the European Central Bank (ECB). The spread series which we model is obtained as a spread relative to ECB AAA Svensson yields; see Svensson (1994) for more details.
Our analysis covers data from five European countries, namely Germany, Great Britain, Italy, Spain and France. For each country, we consider both short-term and long-term bonds issued by the countries between 2007 and 2017. The chosen countries reflect the diversity of European economies in that time period. Two of the analysed countries are from the PIIGS group (Portugal, Ireland, Italy, Greece and Spain), which was at the heart of the European debt crisis and exhibited an economic downturn. Great Britain and France are considered more stable economies, but both dealt with relevant economic news events such as the Brexit Referendum or controversial elections during the considered time period. Finally, we analyse bonds issued by Germany, which is treated as the most risk-free economy within the group. Our choice of countries therefore reflects various economic situations, and the analysis will show how news has adverse effects in different economic settings. The analysed bond data includes more than 300 bonds, and daily closing prices from Thomson Reuters Pricing are utilised. We focus on analysing spreads of single tradeable sovereign bonds, thereby concentrating on the link between macroeconomic news and fixed income products. We would like to highlight that the obtained results can be utilised by risk management models in the Fixed Income domain.

Macroeconomic News Sentiment
We wish to analyse the effect that news articles and macroeconomic announcements have on spreads of bond yields. In our study, macroeconomic sentiment compiled by RavenPack is employed (see (RavenPack News Analytics 2018) for more details; a subscription is needed, therefore our utilised datasets cannot be made openly available). RavenPack News Analytics marks every macroeconomic news event in news items from various sources with a sentiment value called the event sentiment score ("ESS"). This sentiment value lies between −1 and 1 and quantifies the sentiment of a particular news event for the chosen entity. In our case, we choose the bond issuer as the entity we would like to follow. We create daily news time series out of all sentiment values that stream in over a given day. Our work clearly distinguishes itself from other literature on sovereign bond spreads and their main determinants, since we do not take into account fiscal time series and fundamentals but rather analyse the connection between macroeconomic news sentiment and bond spreads. One main advantage of this is that we are not limited to scheduled announcements, which are still covered in our news database, or to quarterly or semi-annual releases of fundamental figures. On the contrary, news items are observed throughout the day and news sentiment signals are calculated before market closing time. By following this macroeconomic news on a daily basis, we obtain daily macroeconomic signals which can be included in daily trading decisions. Analysis of fundamentals can be an addition to our signals; however, in this work, we concentrate on daily macroeconomic news sentiment and its effect on sovereign bond spreads.
We follow macroeconomic news, which are bundled under the entities Germany, Great Britain, Italy, Spain and France, respectively, representing the issuer of the sovereign bonds. A typical macroeconomic news example from our database includes the time stamp, the relevance of the news with respect to the key word (entity) as well as the event sentiment score.
Depending on weekday and time, each news item is mapped to its relevant trading day. Weekend news is shifted to Monday, and any news arriving after market closing time is shifted to the next working day. For each news item $N_i$, we are given a time stamp $\mathrm{timest}(N_i) = (\mathrm{date}(N_i), \mathrm{time}(N_i))$, which consists of the date and time of the release of the news item, where $i = 1, \ldots, n$ and $n$ denotes the number of news items in the data set. We map the time stamp of each news item to a trading day $\mathrm{TrD}(N_i) \in \{TD_t\}$, $t = 1, \ldots, m$. We have $TD_1 \geq \min_i(\mathrm{date}(N_i))$ and $TD_m \leq \max_i(\mathrm{date}(N_i))$, and $m$ is the number of trading days in the given time interval $[\min_i(\mathrm{date}(N_i)), \max_i(\mathrm{date}(N_i))]$. With $c$ denoting the market closing time, we set

$$\mathrm{TrD}(N_i) = \begin{cases} \mathrm{date}(N_i), & \text{if } \mathrm{date}(N_i) \in \{TD_t\} \text{ and } \mathrm{time}(N_i) \leq c, \\ \min\{TD_t : TD_t > \mathrm{date}(N_i)\}, & \text{otherwise.} \end{cases}$$

We create nine different time series based on the relevance and sentiment value we receive from RavenPack's database to build daily news sentiment values, which are utilised as input variables for our time series models. Firstly, we split the sentiment values into two sub-categories, handling positive and negative news sentiment separately. We conduct a pre-analysis of our news sentiment data, which allows us to consider all news after market closing time until market closing time on the following day for the daily news sentiment. We create (1) a mean news-sentiment value time series, (2) a volume of news time series and (3) a news-impact time series, for the three categories (a) all news, (b) positive news and (c) negative news.
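The mapping of time stamps to trading days described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the closing time `MARKET_CLOSE`, the treatment of every non-holiday weekday as a trading day, and all function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical closing time c; the text does not state the market hours used.
MARKET_CLOSE = time(17, 30)


def is_trading_day(d: date, holidays: frozenset = frozenset()) -> bool:
    """Assumption: every weekday that is not a listed holiday is a trading day."""
    return d.weekday() < 5 and d not in holidays


def next_trading_day(d: date, holidays: frozenset = frozenset()) -> date:
    """Return the first trading day strictly after d."""
    d += timedelta(days=1)
    while not is_trading_day(d, holidays):
        d += timedelta(days=1)
    return d


def trading_day(timestamp: datetime, close: time = MARKET_CLOSE,
                holidays: frozenset = frozenset()) -> date:
    """Map a news time stamp to the trading day whose close it can still affect.

    News released on a trading day up to the close stays on that day; weekend
    news and news after the close move to the next trading day.
    """
    d = timestamp.date()
    if is_trading_day(d, holidays) and timestamp.time() <= close:
        return d
    return next_trading_day(d, holidays)
```

For example, an item released on Saturday 25 June 2016 is assigned to Monday 27 June 2016, while an item released on Thursday 23 June 2016 at 10:00 stays on that Thursday.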
We build the volume of news time series $V(t)$, with $n$ denoting the number of news items and $t \in \{TD_1, \ldots, TD_m\}$ denoting the current trading day, as

$$V(t) = \sum_{i=1}^{n} \mathbb{1}_{\{\mathrm{TrD}(N_i) = t\}}.$$

The mean news-sentiment time series takes into account the event sentiment score "ESS", which is delivered with each news event, $-1 \leq \mathrm{ESS} \leq 1$. The mean overall sentiment scores are calculated for each trading day, leading to a trading-day mean news-sentiment time series. The mean news-sentiment value time series $MS(t)$ is calculated as

$$MS(t) = \frac{1}{V(t)} \sum_{i=1}^{n} \mathrm{ESS}(N_i)\, \mathbb{1}_{\{\mathrm{TrD}(N_i) = t\}}.$$

The news-impact time series takes into account the potential decay in the influence of a news story. The news items for each working day are weighted, keeping in mind that the most recent news item before closing has the highest effect on the closing yield. The other news items, which come in before that, have a decaying importance. The news-impact time series $IS(t)$, with $c$ denoting the closing time of the market, is given as

$$IS(t) = \sum_{i=1}^{n} \mathrm{ESS}(N_i)\, e^{-\lambda (c - \mathrm{time}(N_i))}\, \mathbb{1}_{\{\mathrm{TrD}(N_i) = t\}}.$$

The calculation of the news-impact time series was introduced by (Yu and Mitra 2016). The sentiment value ESS is multiplied by a decreasing exponential weight, leading to the news impact score $I = \mathrm{ESS} \cdot e^{-\lambda(c - \mathrm{time})}$. The closing time of the market $c$ is the reference time, $c - \mathrm{time}$ measures the difference between the news time and the market closing time, and the decay factor $\lambda$ is determined through $e^{-\lambda(240)} = 1/2$. We choose a time span of 640 min, after which news stories only have half of their impact left.
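The three daily series described above (news volume, mean sentiment and news impact) can be computed in a single pass over the news items. The sketch below is illustrative only and rests on several assumptions: each item is already mapped to its trading day, the time to the close is given in minutes, the daily impact is taken as the sum of the individual impact scores, and the half-life is left as a parameter whose default of 240 minutes simply follows the defining relation for λ. The function names and the tuple layout are hypothetical.

```python
import math
from collections import defaultdict


def decay_rate(half_life_minutes: float) -> float:
    """Solve exp(-lambda * half_life) = 1/2 for lambda."""
    return math.log(2.0) / half_life_minutes


def daily_series(news, half_life_minutes: float = 240.0):
    """Build daily volume, mean-sentiment and impact series.

    news: iterable of (trading_day, minutes_before_close, ess) tuples, where
    trading_day is any hashable label and ess lies in [-1, 1].
    """
    lam = decay_rate(half_life_minutes)
    volume = defaultdict(int)
    ess_sum = defaultdict(float)
    impact = defaultdict(float)
    for day, minutes_before_close, ess in news:
        volume[day] += 1
        ess_sum[day] += ess
        # Impact score I = ESS * exp(-lambda * (c - time)); summed per day (assumption).
        impact[day] += ess * math.exp(-lam * minutes_before_close)
    mean_sentiment = {day: ess_sum[day] / volume[day] for day in volume}
    return dict(volume), mean_sentiment, dict(impact)


def split_by_sign(news):
    """Split items into the categories all, positive and negative news."""
    items = list(news)
    return {
        "all": items,
        "positive": [item for item in items if item[2] > 0],
        "negative": [item for item in items if item[2] < 0],
    }
```

Applying `daily_series` to each of the three groups returned by `split_by_sign` yields nine series, three per category.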
Time series of volume of the news items are examined as well. We count the number of relevant news items for the entity considered for a given trading day. Again, weekend news and news after market closing are shifted to the next trading day. We count all incoming news items (neutral, positive and negative) to create the volume of all news time series and distinguish between positive and negative news sentiment to create a volume of positive and negative news time series. Therefore, we create nine different time series observed throughout the time interval where the bond is active. All news time series are utilized as regressors in a regression model as well as external variables in an ARIMAX model. Furthermore, their correlation with the yield spread is calculated for the whole time period and through a rolling window approach.
When analysing correlations between bond yields and news-sentiment time series, we have to consider potential spurious correlations that we might observe due to business cycle effects or common market microstructures. When analysing the volume of news time series, we cannot detect long-term business cycle trends; rather, short-term news event effects are captured quite clearly. The news-impact time series puts the highest weights on the most recent news; news older than 10 h no longer has a high weight in the daily time series. Therefore, we focus on capturing short-term macroeconomic news effects and analyse the correlation between bond spreads and these derived macroeconomic news-sentiment series.
In Figure 1, a typical time series of news volume is depicted. Here, we show a volume series for Germany, where we distinguish between positive and negative news items. We can see spikes in the positive and negative volume series, which are analysed in more detail to understand the reason for this dynamic. Increases in positive and negative news are triggered by various events, which might not all be relevant for movements of bond spreads. Since we would like to analyse effects of macroeconomic news on bond spreads, we are less interested in news collected under the topic "social"; instead, we concentrate in the following on news from the broad topics "politics" and "economics". One important event that occurred in the covered time period was the Brexit Referendum in the UK on 23 June 2016. In Figure 2, we depict the volume and sentiment of positive, negative and overall news events for the entity UK between May and July 2016. A clear spike in the volume of news can be detected on and shortly after 23 June. In addition, the overall sentiment shows a decline in this period; however, the sentiment series exhibits less movement during the referendum period. Here, the volume of macroeconomic news is a stronger signal for events than the daily sentiment series. This example highlights the different features of the daily sentiment series: counting the occurring news events which are relevant for an entity is often a good indication of market movements.

Correlation and ARIMAX Models
In order to establish whether a relation between the different news time series and the yield spread changes exists, we test for correlation between the daily spread change series and all nine news time series. We calculate Pearson's correlation between the daily time series and test whether the correlation is significant. Furthermore, the correlation is estimated within a rolling window to see time-varying features of the correlation between time series. We would like to investigate stability and predictive accuracy over time.
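As an illustration of the correlation test, the sketch below computes Pearson's correlation coefficient together with the usual t-statistic $t = r\sqrt{(n-2)/(1-r^2)}$. The text does not state which significance test is applied, so the large-sample normal approximation used for the two-sided p-value is an assumption made here for simplicity; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlation_test(x, y):
    """Return (r, t, p) for the null hypothesis of zero correlation.

    The p-value uses a normal approximation to the t distribution, which is
    reasonable for long daily series (assumption, not taken from the text).
    """
    r = pearson(x, y)
    n = len(x)
    if abs(r) >= 1.0:
        return r, math.copysign(math.inf, r), 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = math.erfc(abs(t) / math.sqrt(2.0))  # two-sided tail probability
    return r, t, p
```

A correlation would then be called significant at the 5% level when the returned p-value is below 0.05.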
By calculating rolling correlations between daily news sentiment series and bond spread series, we establish the correlation through time. We would like to find out whether the correlations are more or less stable over time or whether correlation values vary strongly. In particular, we are interested in investigating change points in the rolling correlation series. We assume that a significant change from positive to negative correlations indicates changes in the market environment. These changes can be captured in a regime-switching setting, where market states of the underlying bond spread can be filtered out by estimating the current state of the rolling correlation time series. In the literature, exogenous break points in sovereign bond spread series have been established in our considered time window between 2007 and 2017. The exogenous breaks often mark a division into pre- and post-crisis periods (see, amongst others, (Afonso et al. 2015; Caggiano and Greco 2012)). Further methodologies to analyse a changing relationship between news time series and yield spreads over time include the Dynamic Conditional Correlation model by (Engle 2002) and the specific model formulation for leptokurtic data analysed in (Del Brio et al. 2011). In addition, a cointegration analysis can add further value, especially a threshold cointegration where adjustments occur when deviations reach a certain threshold (see, e.g., (Stigler 2010) for an overview and (Tsagkanos and Siriopoulos 2015) for a case study on asymmetric effects between markets and production in Northern and Southern Europe). These methodologies are promising alternatives and will be analysed in future work. We analyse our correlation findings further within a regime-switching setting, where hidden regime switches are estimated by filtering out information on the observed rolling correlation. News and their correlation to bond spreads are then analysed taking into account the current filtered market regime.
Secondly, a linear regression is performed to analyse the effects of news time series on the yield spread changes. All nine news time series are taken as regressors in a variety of combinations. We report here results for regression with three news series regressors, namely the Volume of All News, Positive Impact and Negative Impact.
Lastly, we apply an Autoregressive Integrated Moving Average (ARIMA) model (see (Tsay 2010) for more details) to analyse and forecast bond yields. We additionally add external explanatory variables to the model, thereby fitting an ARIMAX(p,i,q) model to yield spread changes. The ARIMAX(p,i,q) model is given through

$$d_t = \phi_0 + \sum_{j=1}^{p} \phi_j d_{t-j} + a_t - \sum_{j=1}^{q} \theta_j a_{t-j} + \sum_{l=1}^{m} \beta_l x_{lt},$$

where $d_t$ is the $i$-th differenced series of the time series $r_t$, $\{a_t\}$ is a white noise series, $x_{lt}$ is the $l$-th external explanatory variable, $l = 1, \ldots, m$, and $\phi_j$, $\theta_j$ and $\beta_l$ denote the autoregressive, moving-average and regression coefficients, respectively. The explanatory variables are uni- or multivariate. An ARIMAX model was also successfully applied by Apergis (2015) to analyse Credit Default Swap (CDS) spreads and newswire sentiments. His study reports improved forecast errors when external news time series were included. We first model the spread changes with an ARIMA(p,i,q) model and compare the resulting in-sample and out-of-sample one-step-ahead forecast errors to those which arise from ARIMAX(p,i,q) models with various external regressors. In the following sections, we run a considerable number of models on our daily yield spread series, taking into account uni- as well as multivariate external explanatory variables. We can improve the forecast errors of the analysed bonds when sentiment is taken into consideration. This indicates that news sentiment has value for bond yield modelling and risk assessment. Monitoring macroeconomic news sentiment series in addition to the actual yield spread can provide early warning signs of unexpected changes in yields or structural changes visible in the yield spreads.
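The comparison of forecast errors with and without an external regressor can be illustrated with a deliberately simplified stand-in. The sketch below is not a full ARIMAX estimator: it fits only an autoregressive model of order one, optionally with one exogenous series (an ARX(1) model without moving-average terms), by ordinary least squares and then compares in-sample one-step-ahead root mean squared errors. All function names are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def design_rows(d, x=None):
    """Rows [1, d_{t-1}] or [1, d_{t-1}, x_t] for t = 1, ..., len(d) - 1."""
    rows = []
    for t in range(1, len(d)):
        row = [1.0, d[t - 1]]
        if x is not None:
            row.append(x[t])
        rows.append(row)
    return rows


def fit_arx1(d, x=None):
    """Least-squares fit of d_t = phi0 + phi1 * d_{t-1} (+ beta * x_t)."""
    rows, targets = design_rows(d, x), d[1:]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def one_step_forecasts(d, coef, x=None):
    """One-step-ahead forecasts of d_1, ..., d_{len(d)-1}."""
    return [sum(c * v for c, v in zip(coef, row)) for row in design_rows(d, x)]


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))
```

Fitting once with `x=None` and once with a news series as `x`, and comparing `rmse(d[1:], ...)` for both sets of forecasts, mirrors the ARIMA versus ARIMAX comparison in a reduced form. An out-of-sample version would fit on an initial window and forecast the following day.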

Results and Discussion
We analyse sovereign bond spreads of five European countries and develop a sentiment-enhanced predictive modelling approach. We distinguish between short- and long-term bonds. In our categorisation, short-term bonds have a maximum number of days to maturity of 1500.

Correlation and Market Regimes
First, we wish to investigate whether significant correlation can be found between spread time series from long-term bonds and our derived sentiment time series. The state of the markets, changes to it and therefore news that affect market behaviour have an effect on prices and spreads in the Fixed Income market. Up until now, less effort has been put into establishing the link between daily news sentiment and the dynamics of bond spreads.
We would like to analyse the correlation between the daily news sentiment series introduced above and bond spread changes. This allows us to restrict which of the news sentiment series are incorporated into the prediction of bond spreads. The daily news sentiment series carry information on market movements and activities consolidated in a daily signal. We concentrate here on macroeconomic news to underline the effect that this news has on the sovereign risk of the issuer of the bond and therefore also on the bond spread. In equity markets, the question arises whether the news or the market is quicker, that is, whether news contains information which is not already reflected in the prices. This question is difficult to answer, and there might be cases for both scenarios. However, Fixed Income markets in general are less likely to absorb macroeconomic information as fast as equity markets, since bond portfolios tend to be medium- to long-term investments and algorithmic trading decisions play a significantly smaller role than in equity markets. We therefore use the daily macroeconomic news sentiment series as a source of information on current market affairs and market changes for the sovereign bonds. A correlation analysis shows whether a general connection between news signals and bond spread movements exists and, if so, how stable the correlation is and how news series can be utilised for daily predictions.
To conduct the correlation analysis, we create three spread time series, namely the spread series $S_t$, $t = 1, \ldots, K$, the first difference time series of this spread $D_t$, $t = 2, \ldots, K$, and the volatility time series $V_t$ of $D_t$, $t = 2, \ldots, K$. Our proxy volatility time series is calculated by taking the absolute value of $D_t$, $t = 2, \ldots, K$. The duration of the bond in days is denoted by $K$:

$$S_t = Y_t - B_t, \qquad D_t = S_t - S_{t-1}, \qquad V_t = |D_t|.$$

We denote the yield of the benchmark bond used to calculate the spread by $B_t$; the yield of the investigated bond is $Y_t$. Time series $S_{jt}$, $D_{jt}$ and $V_{jt}$, $j = 1, \ldots, J$, $t = 1, \ldots, K$, are calculated for all $J$ bonds.
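The three series described above follow directly from the bond yield and the benchmark yield. A minimal sketch, assuming both yields are available as equally long lists on the same trading days (function names are hypothetical):

```python
def spread_series(bond_yield, benchmark_yield):
    """Spread: bond yield minus benchmark yield on each day."""
    if len(bond_yield) != len(benchmark_yield):
        raise ValueError("yield series must cover the same days")
    return [y - b for y, b in zip(bond_yield, benchmark_yield)]


def first_difference(series):
    """Daily change of a series; one element shorter than the input."""
    return [current - previous for previous, current in zip(series, series[1:])]


def volatility_proxy(differences):
    """Absolute value of the daily spread change, used as a volatility proxy."""
    return [abs(value) for value in differences]
```

With yields of 2.5, 2.75 and 2.25 against benchmark yields of 1.0, 1.0 and 1.25, the spread is 1.5, 1.75 and 1.0, the spread changes are 0.25 and −0.75, and the volatility proxy is 0.25 and 0.75.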
We calculate the rolling correlation between news and spread time series with a window size of 250 days. We analyse the significance of the correlation coefficient for each individual bond in our data set as well as for a mean spread time series $M_t$, $t = 1, \ldots, N$, for each considered country. The time window covers the $N$ days including all time intervals from all analysed long-term bonds. The mean spread is given by

$$M_t = \frac{1}{n_t} \sum_{j=1}^{n_t} S_{jt},$$

with $n_t$ denoting the number of available spreads at time $t$, so that the sum runs over the bonds for which a spread is observed on that day. A mean spread is derived separately for each country, so that we can analyse the country specifics thoroughly.
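The mean spread over the bonds available on each day and the rolling correlation can be sketched as follows. This is an illustration under assumptions: each bond's spreads are stored in a dictionary keyed by a sortable day label, both series passed to the rolling correlation are already aligned on the same days, and the function names are hypothetical. The default window of 250 days is the one stated in the text.

```python
import math


def mean_spread(spreads_by_bond):
    """Average the spreads that are available on each day.

    spreads_by_bond: list of dicts {day: spread}, one dict per bond.
    Returns {day: mean spread}, ordered by day.
    """
    totals, counts = {}, {}
    for series in spreads_by_bond:
        for day, spread in series.items():
            totals[day] = totals.get(day, 0.0) + spread
            counts[day] = counts.get(day, 0) + 1
    return {day: totals[day] / counts[day] for day in sorted(totals)}


def pearson(x, y):
    """Pearson correlation; returns NaN when one of the windows is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(x, y, window=250):
    """Correlation over each window of consecutive observations.

    The k-th output value uses observations k, ..., k + window - 1.
    """
    if len(x) != len(y):
        raise ValueError("series must be aligned on the same days")
    return [pearson(x[end - window:end], y[end - window:end])
            for end in range(window, len(x) + 1)]
```

A change of sign along the output of `rolling_correlation` is the kind of pattern that the regime analysis in the following sections looks for.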

Sovereign Bonds Spreads in Spain and Germany
Firstly, we analyse sovereign bonds issued by Spain. Over recent years, markets in Spain were in turmoil due to the European sovereign debt crisis. We expect to find changing correlations between the news sentiment series and Spanish bond spreads over this period, especially between the years 2008 and 2015. These changing dynamics might have been influenced by European adjustment programs, e.g., the European Financial Stability Fund (EFSF) (see, e.g., (Afonso et al. 2015)). The increased financial risk arising from the sovereign debt crisis led to a widening of sovereign bond spreads. The response from the European Union, e.g., introducing the EFSF, had softening effects on the correlation between fundamental risk factors and sovereign risk. Turbulent market times are also mirrored in more turbulent times in the news; therefore, we expect to find changing correlations over time.
We analyse the correlation between the Mean Spread series and the News Sentiment series and find significant correlation between $D_t$ and the series "All Sentiment", "All Impact" and "Volume of Neg News". Table 1 shows the percentages of bonds with significant correlations for each sentiment time series with $S_t$, $D_t$ and $V_t$ for long-term bonds in Spain. The highest percentages of significant correlations arise with the Volume of All News and the Volume of Negative News as well as with the time series on Negative Impact. Furthermore, we depict the rolling correlation analysis on long-term bonds in Figure 3. The upper graph depicts the number of news items for the entity Spain between 2007 and 2017. It can clearly be seen that the news volume steadily increased between 2011 and 2014, a time when the sovereign debt crisis hit Spain, its companies and its people. It is not surprising that the number of news items increased in that period. Furthermore, the rolling correlation between the mean long-term spread and the news impact series is plotted in the second graph. An increase in correlation between news volume and long-term bond spreads can be detected between 2011 and 2014. It is noticeable that positive as well as negative news volume increased their correlation, both showing a positive correlation in times of crisis and large news volumes. This indicates that, overall, the volume is a well-suited explanatory variable, highlighting the positive correlation between the widening of sovereign spreads and news volume. Analysing the news volume correlation time series in more detail, one can see that switching market regimes might lead to a switch in the direction of correlation. As can be seen in the third graph, between 2007 and 2009, the correlation between positive news volume and spread was negative, indicating smaller spreads when more positive news was detected. However, this sign changed when the state of the market changed, leading to overall positive correlations between news volume and spread, regardless of the tone of the news. Therefore, in bear markets, the importance of the overall news volume is highlighted.
Furthermore, short-term bonds with a duration of less than 1500 days are analysed. In the following, all of the analysed bonds were issued by the Government of Spain between 2007 and 2017. We analyse 20 bonds; the spread, spread change and volatility of the spread series are calculated, and their correlations with the sentiment series are estimated. Table 2 shows the percentages of bonds with significant correlations for each sentiment time series with $S_t$, $D_t$ and $V_t$ for 20 short-term bonds in Spain. We find that the volatility series are typically more strongly correlated with the daily news-sentiment series than the spread difference. Sentiment series in this country for short-term bonds can be utilised to improve the prediction of daily volatility. The sentiment series, especially the volume series of all news and the "All Sentiment" series, give additional insight for one-step-ahead predictions of bond spread volatilities.

Market Regime Detection
To underpin the occurrence of changing regimes in the rolling correlation between sovereign bond spreads and the volume of news, we analyse the switching behaviour of the correlation time series. We assume an underlying Hidden Markov Model (HMM) to find the most suitable state sequence. In a Hidden Markov Model, the market states are modelled through a Markov chain with a given number of states. The Markov chain is not observable but hidden in an observable market time series. Filtering out information on this hidden Markov chain by calculating conditional expected values leads to current state probabilities. Observable in the market are bond spreads and news items; their rolling correlation is calculated and taken as the observable time series. The actual state of the market is hidden in this correlation process, which becomes the observation process in our Hidden Markov Model. We set the following definitions:
Definition 1 (Observation process). Let $\{y_t, t = 1, \ldots, T\}$ denote a time series of $T$ univariate observations taking values in a sampling space $\mathcal{Y}$, which may be either discrete or continuous. We consider $\{y_t, t = 1, \ldots, T\}$ to be the realisations of the stochastic process $\{Y_t\}_{t=1}^{T}$. We assume that the probability distribution of $Y_t$ depends on the realisation of a hidden stochastic process $X_t$. The stochastic process $Y_t$ is directly observable.
The process $X_t$ is the hidden random process embedded in our observation process. The time series $\{y_1, \ldots, y_T\}$ is observed as a realisation of a stochastic process $\{Y_1, \ldots, Y_T\}$, which is generated by a finite Markov mixture from the distribution family $Y_t \mid X_t \sim \mathcal{T}(\theta_{X_t})$. $X_t$ is an unobservable ergodic Markov chain with $N$ states.
Consider the correlation series $Y_t$, which is modelled within a discrete-time HMM. We assume that $Y_t$ follows the dynamics

$$Y_t = \mu(X_t) + \sigma(X_t)\, z_t,$$

where $\mu$ and $\sigma$ denote the drift and volatility of the time series, respectively, $z_t$ are independent and identically distributed (IID) normal random variables and $X_t$ denotes the Markov chain. Both $\mu$ and $\sigma$ are governed by the Markov chain $X_t$.
To analyse this, we fit a three-state Hidden Markov Model to the time series and adapt the Viterbi algorithm (see (Viterbi 1967) for more details) to find the optimal state sequence. Figure 4 depicts the estimated market state when analysing the correlation between the mean bond spread of bonds issued by Spain and the volume of positive news. We can see here that the estimated market states are in line with the observed widening of credit spreads from 2011 to 2014. Before and after this period, the market in Spain is estimated to be neutral and in a bull state, which mirrors the real market situations in these larger time windows. Additional information on Hidden Markov Models and their estimation can be found in (Rabiner 1989) or (Elliott et al. 1995), amongst others. An in-depth analysis of changing market states and their effects on news sentiment influences is the topic of future work.
German bonds, which have been in a stable market within this time period, also exhibit changing rolling correlations over time. We depict the rolling correlation between the mean long-term spread of bonds issued by Germany and the volume of news items in Figure 5. It can be seen that, when the credit spread gets narrower and even becomes negative in 2011, the correlation between positive news volume and mean spread changes to negative: an increase in positive news is linked to a decrease in the bond spread. This holds because the Fixed Income market in Germany within this period is stable and not in a bear market.
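The decoding step can be illustrated with a standard Viterbi recursion for a Markov chain with Gaussian observations whose mean and standard deviation depend on the state. The sketch assumes that the model parameters (initial distribution, transition matrix, state means and standard deviations) are already estimated and strictly positive where logarithms are taken; parameter estimation and the authors' specific adaptation of the algorithm are not shown. All names and the numerical values in the example are hypothetical.

```python
import math


def gaussian_logpdf(y, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma * sigma) - (y - mu) ** 2 / (2.0 * sigma * sigma)


def viterbi(observations, initial, transition, means, sigmas):
    """Most likely hidden state sequence for a Gaussian-emission HMM.

    initial[s]       : probability of starting in state s (must be > 0)
    transition[r][s] : probability of moving from state r to state s (must be > 0)
    means[s], sigmas[s] : state-dependent level and volatility
    """
    n_states = len(initial)
    log_delta = [math.log(initial[s]) + gaussian_logpdf(observations[0], means[s], sigmas[s])
                 for s in range(n_states)]
    backpointers = []
    for y in observations[1:]:
        new_delta, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this step.
            best_prev = max(range(n_states),
                            key=lambda r: log_delta[r] + math.log(transition[r][s]))
            pointers.append(best_prev)
            new_delta.append(log_delta[best_prev] + math.log(transition[best_prev][s])
                             + gaussian_logpdf(y, means[s], sigmas[s]))
        log_delta = new_delta
        backpointers.append(pointers)
    # Backtrack from the most likely final state.
    state = max(range(n_states), key=lambda s: log_delta[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Applied to a rolling correlation series with three states centred at negative, near-zero and positive correlation levels, the returned path labels each day with one of the three regimes.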

Analysis of Mean Bond Spreads
In this section, we focus our analysis on mean bond spreads from Spain, Germany, Italy, France and the UK. For each country, we consider the mean spread at any given point in time over all bond spreads available in our database. When analysing the mean spread, we investigate in general the potential effect of including impact values and news volume for each entity as external variables. In particular, we analyse the correlation between bond spread series and news time series, perform a linear regression and carry out a one-step-ahead prediction (in- and out-of-sample) through an ARIMAX model. This gives an understanding of how far predictions are enhanced by involving external variables. The external news variables carry information on macro-economic changes and topics in the considered countries. Analysing the mean spread of a given bond portfolio gives a general view of the influence of, and correlations between, macroeconomic news items and fixed income markets for a specific country.
First, we analyse the mean bond yield of 53 long- and short-term government bonds in Spain, which were active between 2007 and 2017. The mean bond yield and its first difference are calculated, and the rolling correlation between this differenced series and the sentiment time series is estimated with a rolling window of 250 days. The volatility proxy, which shows significant correlation with the Volume of News time series, exhibits changing correlation patterns over time. We consider here the changing correlation between the volatility of the mean spread of all sovereign bonds from Spain in our dataset, which covers the time period between 2008 and 2017. Figure 6 depicts the rolling correlation between the volatility and the three Volume of News time series. The correlation with the Volume of All News changes from negative to positive in times when the volatility increases. When the markets become calmer, the rolling correlation decreases again. This pattern can be observed for the other analysed countries as well, which is why the Volume of All News is utilised as an informative time series for further predictions. In turbulent market times, the correlation between the Volume of All News time series and the volatility proxy fluctuates around 0.2; it decreases sharply when the markets enter a quieter period.
Figure 7 depicts the spread change $D_t$ of the mean Spanish bond yield and the rolling correlation series. The window size is set to 250 days; correlation is calculated for six different time series, namely the impact values and the volume series. The rolling correlation changes in particular when market circumstances change, again indicating that news has a different impact in different regimes and that shifts can be examined through the correlation with news volume.
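A rolling correlation with a 250-day window, as used above, can be sketched as follows. This is a plain NumPy illustration on synthetic data; the function name and the simulated series are hypothetical stand-ins for the spread change and a daily news series.

```python
import numpy as np


def rolling_corr(x, y, window=250):
    """Pearson correlation of x and y over a trailing window of fixed length.

    The first window - 1 entries are NaN because the window is not yet full.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(len(x), np.nan)
    for end in range(window, len(x) + 1):
        xs = x[end - window:end]
        ys = y[end - window:end]
        out[end - 1] = np.corrcoef(xs, ys)[0, 1]
    return out


# Synthetic stand-ins for the spread change and a daily news volume series.
rng = np.random.default_rng(0)
spread_change = rng.standard_normal(600)
news_volume = 0.5 * spread_change + rng.standard_normal(600)
rc = rolling_corr(spread_change, news_volume, window=250)
```

Sign changes in such a series are what the text reads as indications of a changing market regime.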
We test for significant correlation and find that, considering only one time window, the correlation between the mean spread and each of the three time series for all news sentiment is significant at α = 0.01. Correlations between the mean spread and the Sentiment and Impact series are negative, whereas the correlation with the Volume of All News time series is positive. Furthermore, in a linear regression, we find that the Volume of All News is a significant regressor.
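The text does not state which significance test is applied. One common choice is the t-test for a Pearson correlation, sketched here with a normal approximation to the t-distribution, which is reasonable only for large samples. The sample size of 250 and the correlation values are hypothetical.

```python
import math


def corr_p_value(rho, n):
    """Two-sided p-value for H0: correlation = 0.

    Uses t = rho * sqrt((n - 2) / (1 - rho^2)) and approximates the
    t-distribution by a standard normal (assumption: n is large).
    """
    t_stat = rho * math.sqrt((n - 2) / (1.0 - rho**2))
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


alpha = 0.01
# Hypothetical inputs: one 250-day window with two candidate correlations.
significant_weak = corr_p_value(0.10, 250) < alpha
significant_strong = corr_p_value(0.20, 250) < alpha
```

With a window of this length, a correlation of 0.2 passes the 1% threshold while 0.1 does not, which shows how the window size limits the correlations that can be detected.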
Let us furthermore consider bond spreads from Germany, a rather stable economy within the European Union in the considered time interval. The number of news items covered for this entity between 2007 and 2017 is 360,864. After filtering for news with a relevance >60 and leaving out news from the topic "society", the database contains 167,359 news items within this time period, which are used to create the daily news time series. We find significant correlation ρ between $V_t$, the volatility of the spread difference, and the news time series All Sentiment (ρ = −0.14), All Impact (ρ = −0.14), Positive Sentiment (ρ = −0.1), Volume of Positive News (ρ = −0.04), Positive Impact (ρ = −0.11), Negative Sentiment (ρ = −0.06) and Negative Impact (ρ = −0.06). All news time series are negatively correlated with the volatility of the bond spread difference, leading to the conclusion that lower sentiment and impact can be observed when higher volatility is present.
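The filtering and daily aggregation step might look like the following sketch. The record layout (keys `day`, `relevance`, `topic`, `sentiment`) is a hypothetical assumption, as is the choice of the daily mean as the sentiment aggregate; only the relevance threshold and the excluded topic come from the text.

```python
from collections import defaultdict
from datetime import date


def daily_news_series(items, min_relevance=60, excluded_topics=("society",)):
    """Aggregate filtered news items into daily sentiment and volume figures."""
    buckets = defaultdict(list)
    for item in items:
        if item["relevance"] > min_relevance and item["topic"] not in excluded_topics:
            buckets[item["day"]].append(item["sentiment"])
    series = {}
    for day in sorted(buckets):
        scores = buckets[day]
        series[day] = {
            "sentiment": sum(scores) / len(scores),        # assumed aggregate: daily mean
            "volume_all": len(scores),
            "volume_positive": sum(s > 0 for s in scores),
            "volume_negative": sum(s < 0 for s in scores),
        }
    return series


# Hypothetical news items for illustration only.
items = [
    {"day": date(2011, 5, 2), "relevance": 90, "topic": "economy", "sentiment": 0.5},
    {"day": date(2011, 5, 2), "relevance": 75, "topic": "politics", "sentiment": -0.25},
    {"day": date(2011, 5, 2), "relevance": 40, "topic": "economy", "sentiment": 0.9},
    {"day": date(2011, 5, 3), "relevance": 80, "topic": "society", "sentiment": 0.3},
    {"day": date(2011, 5, 3), "relevance": 95, "topic": "economy", "sentiment": -0.6},
]
daily = daily_news_series(items)
```

Days without any qualifying news do not appear in the result; how such days are treated in the paper's series is not specified here.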

Linear Regression for Spread Volatility
Moving on from these correlation findings, we conduct a linear regression in which the three time series "Volume of All News", "Positive Impact" and "Negative Impact" are chosen as regressors. The results in Table 3 show that the impact time series are significant regressors at the 0.1% level, so the positive and negative impact series are informative input variables for the volatility proxy. Both impact scores have negative coefficient estimates; therefore, positive news impact has a decreasing effect on the spread volatility, whereas negative impact (which takes values <0) has an increasing effect on volatilities. We obtain similar regression results for the other countries' volatility time series (see Tables 4-7). For the spread volatility of Italy, we obtain significant estimates for the Volume of All News regressor; for Spain, the UK and France, the negative news impact is a significant regressor. We therefore conclude that regressors derived from macro news sentiment have an effect on these sovereign bond spread volatilities.
Furthermore, we analyse the correlation between the mean bond spread volatilities and the impact scores. As an example, we depict the positive, negative and general impact scores for Italy in Figure 8. The mean level of the positive impact score time series is roughly 0.4; the mean level of the negative impact score time series is around −0.4. An increase in negative scores therefore means a value closer to zero. Figure 9 depicts the rolling correlation between the volatility of the mean bond spread and the impact scores for the five countries and shows how the correlation changes over time. As pointed out in the previous section, correlation patterns change and have varying signs in varying market states.
By comparing the rolling correlation dynamics in the five European countries, we see that the correlation between the news impact values and the volatility mainly fluctuates in a range between −0.2 and 0.2. On average, there is a negative correlation between the negative impact scores and the spread volatility: when the negative impact score rises, the volatility decreases.
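The regression set-up of this section can be sketched as an ordinary least squares fit with three regressors. The data below are simulated, with impact levels near 0.4 and −0.4 as reported for Italy; the coefficients are hypothetical and chosen only so that both impact series enter with a negative sign, as in the reported results.

```python
import numpy as np


def ols_with_t_stats(y, X):
    """OLS with intercept; returns coefficients, standard errors and t-statistics."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    dof = len(y) - X1.shape[1]
    sigma2 = resid @ resid / dof                       # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X1.T @ X1)))
    return beta, se, beta / se


# Simulated regressors (hypothetical values, not the paper's data).
rng = np.random.default_rng(1)
n = 2000
volume_all = rng.poisson(100, n).astype(float)
pos_impact = 0.4 + 0.1 * rng.standard_normal(n)
neg_impact = -0.4 + 0.1 * rng.standard_normal(n)
true_beta = np.array([0.5, 0.001, -0.8, -0.6])         # intercept, volume, pos, neg
X = np.column_stack([volume_all, pos_impact, neg_impact])
volatility = true_beta[0] + X @ true_beta[1:] + 0.05 * rng.standard_normal(n)

beta, se, t_stats = ols_with_t_stats(volatility, X)
```

Because the negative impact series takes values below zero, a negative coefficient on it raises the fitted volatility when the news is more negative, which matches the interpretation given in the text.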

Prediction of Sovereign Bond Spreads through News Sentiment
We now turn our focus to enhancing the prediction of future bond spreads through macroeconomic news sentiment. Our previous results on correlation in both static and rolling windows, as well as on linear regression with daily news sentiment time series, lead us to utilise news sentiment time series as external variables in an Autoregressive Integrated Moving Average model with explanatory variables (ARIMAX).
After testing for the best-suited order parameters, we fit an ARIMAX(2,0,2) model to 51 sovereign bonds issued by Germany and use it for forecasting. We model the spread difference series $D_t$, choose different external variables and compare five model set-ups:
ARIMAX 1: No external variable
ARIMAX 2: Positive Impact; Negative Impact
ARIMAX 3: Volume of All News; All Impact
ARIMAX 4: Volume of Positive News
ARIMAX 5: Volume of All News
An error analysis shows that adding external variables decreases the forecast error in both the in-sample and out-of-sample period. The length of the out-of-sample period is chosen as 15% of the length of the time series. Table 8 shows the empirical error distribution of each model. The lowest median Root Mean Squared Error (RMSE) is reached by ARIMAX 2 (in-sample) and ARIMAX 3 (out-of-sample).
Table 8. Error distribution of ARIMAX models for in-sample and out-of-sample window of analysis of sovereign bonds issued by Germany. The highlighted data show the lowest Median RMSE over all models.

We further find that, when modelling daily spread differences, the lowest RMSE in the in-sample period is achieved by Model ARIMAX 2 (in 55% of cases), followed by Model ARIMAX 3 (in 33%). For the out-of-sample period, the lowest RMSE is obtained by Model ARIMAX 2 (in 31%), followed by Model ARIMAX 1 (in 23%). A similar pattern arises when the volatility proxy is predicted.
For the UK, we analyse 100 sovereign bonds: 28 long-term and 72 short-term bonds. We predict the spread changes as well as the volatility with the same model set-up as explained above and find that the lowest RMSE in the in- and out-of-sample set-up is reached by Model ARIMAX 2 (60% and 32% of bonds, respectively) and Model ARIMAX 3 (30% and 26% of bonds, respectively).
We conclude that, for both countries, the predictions of spread changes from single bonds are best enhanced by Positive and Negative Impact series as external variables as well as Volume and Impact of All News. The forecast error of predicting volatility is also improved when impact series and volume series of all news are taken into account. The risk assessment can therefore be enhanced through adding these macroeconomic news series.
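The comparison of models with and without news variables can be illustrated with a deliberately simplified stand-in: an ARX(2) model fitted by least squares, which drops the moving-average terms of the ARIMAX(2,0,2) used in the paper. The sketch also assumes that the external variable enters with a one-day lag, and it uses simulated data; function and variable names are hypothetical.

```python
import numpy as np


def arx_rmse(d, exog=None, p=2, train_frac=0.85):
    """Fit an ARX(p) model by least squares on the first train_frac of the sample.

    Returns (in-sample RMSE, out-of-sample RMSE) of one-step-ahead predictions.
    """
    d = np.asarray(d, dtype=float)
    T = len(d)
    cols = [np.ones(T - p)] + [d[p - k:T - k] for k in range(1, p + 1)]
    if exog is not None:
        exog = np.asarray(exog, dtype=float).reshape(T, -1)
        cols.append(exog[p - 1:T - 1])          # assumption: news from the previous day
    X = np.column_stack(cols)
    y = d[p:]
    split = int(train_frac * len(y))
    beta, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    resid = y - X @ beta

    def rmse(r):
        return float(np.sqrt(np.mean(r**2)))

    return rmse(resid[:split]), rmse(resid[split:])


# Simulated spread changes driven by lagged news impact (hypothetical coefficients).
rng = np.random.default_rng(2)
T = 2000
news_impact = rng.standard_normal(T)
d = np.zeros(T)
for t in range(2, T):
    d[t] = (0.3 * d[t - 1] - 0.2 * d[t - 2]
            + 1.0 * news_impact[t - 1] + 0.5 * rng.standard_normal())

rmse_plain = arx_rmse(d)                # analogue of ARIMAX 1: no external variable
rmse_news = arx_rmse(d, news_impact)    # analogue of a model with a news regressor
```

The 85%/15% split mirrors the in-sample and out-of-sample division described in the text; a full ARIMAX fit would additionally estimate the two moving-average coefficients by maximum likelihood.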

Countries' Mean Bond Spreads
We now perform the ARIMAX analysis on the mean bond spread of each European country we consider. For each point in time, we calculate the mean spread value of the bonds available in the dataset of each country. This mean value is taken as a reference bond spread for the country at a given point in time, which reflects the typical movements and therefore risks of the issued sovereign bonds. We consider both long- and short-term bonds.
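The construction of the mean spread series can be sketched as below, assuming the spreads are stored as a days-by-bonds array with NaN wherever a bond is not active. The spread change is taken as the first difference and the volatility proxy as its absolute value; the array layout, the function name and the numbers are hypothetical.

```python
import numpy as np


def mean_spread_series(spreads):
    """Mean spread over all active bonds per day, its first difference and |difference|.

    spreads: array of shape (days, bonds) with NaN for bonds that are not active.
    """
    mean_spread = np.nanmean(spreads, axis=1)     # ignore inactive bonds
    spread_change = np.diff(mean_spread)          # D_t = S_t - S_{t-1}
    volatility_proxy = np.abs(spread_change)      # V_t = |D_t|
    return mean_spread, spread_change, volatility_proxy


# Hypothetical spreads of two bonds over three days.
spreads = np.array([[1.0, np.nan],
                    [1.2, 0.8],
                    [np.nan, 0.9]])
mean_spread, spread_change, volatility_proxy = mean_spread_series(spreads)
```

Note that a change in the set of active bonds can move the mean spread on its own, as on the second day of the example, where a second bond enters the average.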
Furthermore, we compare the performance of the prediction through ARIMAX models in the different European countries. We analyse the model performance by setting the first 85% of the length of each time series as in-sample and the last interval as the out-of-sample period. When modelling the first difference of the mean bond spread, we set the explanatory variable in ARIMAX 4 to the Volume of Negative News and in ARIMAX 5 to All Impact, since these variables have a stronger impact on the prediction than Volume of Positive News and Volume of All News, which are chosen when the volatility proxy is modelled.
First, we calculate the Akaike Information Criterion (AIC) for each model set-up to see whether including external variables results in an improvement over the standard ARIMA model when modelling the first difference and the volatility proxy of each country's mean spread. The AIC is calculated for each country and model, and the results are presented in Tables 9 and 10; the best AIC value per country is highlighted. In most cases, adding external variables improves the AIC value of the model. However, the best external variable differs by country. The RMSE values from the error analysis of the mean spread change are shown in Tables 11 and 12 for the five countries in the in- and out-of-sample periods. We consider roughly 50 bonds in Spain, Germany and Italy between 2007 and 2017; for France and the UK, we analyse around 100 sovereign bonds. The results show that including sentiment series as external variables improves the forecast in both the in-sample and out-of-sample periods. Models ARIMAX 2 and ARIMAX 3 are the best-performing model set-ups: choosing the news impact time series as well as the volume of all news as external variables improves the one-step-ahead prediction of the spread change. For each country, the prediction of the mean spread change can be improved when news time series are utilised.
We furthermore repeat the analysis for the volatility proxy, the absolute value of the mean bond spread changes, and find a similar result: using news sentiment time series as input variables decreases the RMSE of the one-step-ahead forecast. Tables 13 and 14 depict the RMSE for the five countries in the in- and out-of-sample periods. We therefore conclude that our derived daily news signals are a valuable input and contain information for forecasting spread changes and their volatility.
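For reference, the two criteria used in this comparison have the following standard textbook definitions. The symbols are assumptions introduced here: $k$ is the number of estimated parameters, $\hat{L}$ the maximised likelihood of the fitted model, $n$ the number of forecasts in the evaluation window and $\hat{D}_t$ the one-step-ahead forecast of the spread change $D_t$.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\bigl(D_t - \hat{D}_t\bigr)^2}
```

Lower values are better for both. Since each external variable increases $k$, the AIC improves only if the gain in log-likelihood outweighs the penalty for the additional coefficient.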
The volatility forecast as a measure of risk is enhanced when the daily news sentiment of each country is considered; the risk assessment is more accurate when macroeconomic news is taken into account. We furthermore test whether the coefficients are significant in our model settings for the volatility and whether the significant explanatory variables differ by country. Tables 15-19 display the estimates, standard errors, z-values and probabilities Pr(>|z|), with one table per country.
To sum up the results from the ARIMAX models for the mean spread and volatility, we find varying characteristics for different economies. Bond spread dynamics in more stable economies such as Germany are less affected by macroeconomic news: although the error measures improve when forecasting the volatility, the external macroeconomic news variables are not significant. There still seems to be value in the covered macroeconomic news, but the volatility changes are less strong. For the UK and France, in contrast, we find more significant explanatory news variables in our model set-ups. Including the macroeconomic news impact of all variables has a significant effect on the forecast of the spread volatility in both countries. The two analysed countries from the PIIGS group (Spain and Italy) show the clearest impact of macroeconomic news sentiment on the yield dynamics: both the News Impact series and the Volume series are significant regressors in the ARIMAX models. Our findings are also in line with those of (Evgenidis et al. 2017), who see a clear connection between stock market volatility and yield spreads. Here, the movements in German yields are less affected, whereas France exhibits a stronger Granger causality between the VIX and the yield spread. Our analysis shows that macroeconomic news sentiment, which impacts stock volatility, also impacts the volatility of bond spreads, especially in less stable economies.
Political and economic decisions and news events have an immediate effect on sovereign bonds; both policy makers and risk managers should monitor such news to spot changes in the movement and volatility of sovereign bonds.

Conclusions
Our analysis finds significant correlations between aggregated daily macroeconomic news time series and sovereign bond spreads in five European countries. We investigate the behaviour of both long- and short-term bonds and find, in most cases, significant correlations between the three time series of spreads (yield spreads, spread changes and the volatility of spreads) and daily news sentiment time series. These news time series take into account either the sentiment of a news item or the volume of the news for a specific entity. We distinguish between all, positive and negative news items and find significant correlations between these series and the bond spread. Whether the positive or the negative news series shows a higher correlation depends on the state of the business cycle. The changing dynamics of correlations are analysed through rolling correlations. We find that a changing sign of the correlation between the spread and the volume of positive news can be taken as an indicator of changing market conditions. In good economic times, the volume of positive news is negatively correlated with bond spreads, whereas bear markets seem to generally exhibit positive correlation with news volume time series. Following our correlation analysis, we recommend taking volume and impact news series into account to capture characteristics in changing markets.
We find that the best-suited external variables in our ARIMAX forecast are the Positive Impact and Negative Impact daily time series (ARIMAX 2) as well as the Volume of All News and All Impact daily time series (ARIMAX 3), which outperform the ARIMAX model without news information for the European countries considered. The RMSE of the one-step-ahead forecast is smaller; that is, the prediction of the one-step-ahead spread changes and of the volatility proxy is improved when external news variables are taken into account. Our findings support earlier results on time-varying factors, since correlations with news sentiment also vary over time and have changing dynamics depending on the state of the market. We conclude that news sentiment adds value to modelling sovereign yield spreads and should be taken into account when analysing and monitoring spreads. Risk assessment of bonds is improved when news volume and impact series of relevant entities are monitored; news sentiment impacts spread volatility.
Our analysis further shows that correlation and forecast errors clearly vary through time. We propose monitoring correlation changes over time to recognise changing market conditions as well as to identify relevant external regressors for a one-step-ahead forecast. The ARIMAX models show improved error measures in both in-sample and out-of-sample performance when news time series are taken into account. We are able to forecast growing risk in bond spreads by including news sentiment information.
Future work will cover an in-depth analysis of regressors and their influence on bond spreads. The instrument universe shall be broadened, and corporate spreads shall be investigated. A first outlook confirmed the findings in this paper for other instruments, and an in-depth analysis will be considered in the near future. Furthermore, regime-switching characteristics between news variables and spreads shall be studied in more detail.
Funding: This research is part of the project SENRISK E!10488 supported by funding from Eurostars-2 Joint programme with co-funding from the European Union Horizon 2020 Research and Innovation programme, which we gratefully acknowledge.