#### 3.1. Data Sources

This paper uses Brent crude oil price data (USD/barrel) from the EIA (Energy Information Administration), covering 1 January 2013 to 31 August 2018, as empirical data, for a total of 1447 observations. We select the data from 1 January 2013 to 31 December 2017 as training and modelling data (1275 data points), and the data from 1 January 2018 to 31 August 2018 as test data (172 data points), to explore the relationship between international oil prices and web text. Unless otherwise specified, the results reported below are obtained on the test data.
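The chronological split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the boundary dates come from the text, while the `(day, price)` record layout, the helper name `split_by_date`, and the toy prices are assumptions made for the example.

```python
from datetime import date

# Boundary dates taken from the text; the record layout is an assumption.
TRAIN_START = date(2013, 1, 1)
TRAIN_END = date(2017, 12, 31)
TEST_START = date(2018, 1, 1)
TEST_END = date(2018, 8, 31)


def split_by_date(records):
    """Split (day, price) pairs into chronological training and test sets."""
    ordered = sorted(records, key=lambda r: r[0])
    train = [r for r in ordered if TRAIN_START <= r[0] <= TRAIN_END]
    test = [r for r in ordered if TEST_START <= r[0] <= TEST_END]
    return train, test


# Toy records for illustration only; these are not EIA values.
toy = [
    (date(2018, 1, 2), 66.0),
    (date(2017, 12, 29), 65.0),
    (date(2013, 1, 2), 110.0),
    (date(2018, 9, 3), 77.0),  # outside the study window, so it is dropped
]
train, test = split_by_date(toy)
print(len(train), len(test))  # 2 1
```

Splitting by date rather than at random keeps every test observation later than every training observation, which avoids leaking future prices into the model.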

Based on the above price data, we draw a time series diagram describing how oil prices fluctuate over time, shown in Figure 1.

As shown in Figure 1, Brent oil prices have undergone significant fluctuations. It is therefore of great importance to forecast oil price fluctuations using an appropriate method. To reflect the fluctuation of the crude oil price more clearly, we carried out a statistical analysis of the Brent crude oil price data, summarized in Table 4.

As shown in Table 4, the average Brent oil price is 71.38, which means that oil prices fluctuate around a level of about 70 USD/barrel. The highest oil price is 118.9, while the lowest is 26.01. The gap between the maximum and minimum prices is large, and the standard deviation is 26.39, which indicates that oil prices fluctuate violently.

In terms of web text, we use Python, JavaScript, AJAX, and other technologies to acquire web text based on 20 oil price-related keywords, such as “oil price” and “oil market”, from reliable on-line media such as Reuters (http://www.reuters.com/) and UPI (https://www.upi.com/). We obtained 107,298 documents with a total of 38,075,959 words. After text pre-processing, data extraction, and data alignment, 47,808 documents with 17,494,162 words remained, covering documents released from January 2013 to August 2018. The data volume is 10 GB. The relevant information is shown in Table 5.

#### 3.3. Choice of Oil Price Forecasting Model

There are many oil price prediction models, which mine different kinds of information from oil prices from different perspectives. Before analyzing the relationship, we choose the model that best explains the relationship between oil price and text sentiment, judged by forecasting performance. Following the introduction in Section 2.3, we select Ridge, Lasso, SVR, BPNN, and RF for testing. Since each algorithm has hyperparameters, manual adjustment is unavoidable. After more than 2000 attempts, the best results are selected for comparison and analysis. As the text sentiment feature, we choose $\mathrm{compound}_{t}$, which expresses the comprehensive sentiment of the article.

As can be seen from Figure 3, the predictions of these algorithms fit the actual oil prices closely and are reliable. To compare the algorithms, accuracy is assessed using the RMSE (root mean square error) and MAPE (mean absolute percentage error), and the EV (error variance) is used to measure the stability of the predicted results [45,46]. The three statistical quantities are defined in Equations (12)–(14):

where $N$ is the number of samples, ${y}_{i}$ is the real oil price, ${f}_{i}$ is the predicted oil price, ${e}_{i}$ is the difference between the real and predicted oil prices, and $\overline{e}$ is the mean of ${e}_{i}$ over all samples.
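The equations themselves do not appear in this text. Assuming the standard definitions of these measures, and using the symbols defined above, they would take the following form, in the order RMSE, MAPE, EV. The $1/N$ normalization of EV, the sign convention ${e}_{i} = {y}_{i} - {f}_{i}$, and the unscaled (non-percentage) form of MAPE are assumptions rather than details confirmed by the source.

$$
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left({y}_{i} - {f}_{i}\right)^{2}},\\
\mathrm{MAPE} &= \frac{1}{N}\sum_{i=1}^{N}\left|\frac{{y}_{i} - {f}_{i}}{{y}_{i}}\right|,\\
\mathrm{EV} &= \frac{1}{N}\sum_{i=1}^{N}\left({e}_{i} - \overline{e}\right)^{2}, \qquad {e}_{i} = {y}_{i} - {f}_{i}.
\end{aligned}
$$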

Table 6 compares the algorithms on RMSE, MAPE, and EV. The values show that the error and error variance of SVR and RF are relatively large, so these two models do not offer good prediction performance. The gap among BPNN, LASSO, and Ridge is not large. BPNN nevertheless has a certain advantage: its RMSE can be below 1.19, showing higher accuracy, while its lower EV indicates higher stability in prediction.
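For reference, the three measures compared in Table 6 can be computed as in the following sketch. It assumes the standard definitions (errors taken as real minus predicted price, population variance with a $1/N$ factor, MAPE as a fraction rather than a percentage); the function names and the toy values are illustrative and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    n = len(actual)
    return sum(abs((y - f) / y) for y, f in zip(actual, predicted)) / n


def error_variance(actual, predicted):
    """Variance of the errors e_i = y_i - f_i around their mean (1/N form)."""
    errors = [y - f for y, f in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)
    return sum((e - mean_error) ** 2 for e in errors) / len(errors)


# Toy prices for illustration only.
actual = [70.0, 72.0, 71.0, 69.0]
predicted = [69.0, 73.0, 71.0, 67.0]

print(rmse(actual, predicted))
print(mape(actual, predicted))
print(error_variance(actual, predicted))
```

Under these definitions, the squared RMSE equals EV plus the squared mean error, so EV isolates the spread of the errors from any systematic bias. This is one way to read EV as a stability measure alongside RMSE as an accuracy measure.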

In addition, relating the nature of each model to its predictive performance suggests that the relationship between oil price and web text sentiment information is quasi-linear: the highly non-linear models, SVR and RF, do not offer good predictions, while the two regularized linear models, LASSO and Ridge, perform better. BPNN, with its flexible network structure, fits the data well and produces excellent price forecasts. Therefore, the subsequent analysis of the relationship between web information and oil prices uses BPNN as the predictive model.

#### 3.4. The Effect of Comprehensive Text Sentiment

This section analyses the impact of the comprehensive sentiment score of the text, $\mathrm{compound}_{t}$, on the performance of oil price forecasts.

It is well known that news is time-sensitive, and people’s cognition of events is also time-sensitive. A report takes time to be digested before it affects oil prices, and once digested it has no obvious long-term impact. Therefore, when forecasting the oil price, it is better to use the news sentiment of the previous few days. In such a time series, it is necessary to know how many lag steps are optimal: here, a first-order lag uses the text sentiment from yesterday, a second-order lag uses the text sentiment from yesterday and the day before yesterday, and so on. RMSE is selected as the indicator of accuracy when comparing how different lags of web sentiment support oil price forecasting, while EV is used as the indicator of stability.
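The lag definition above can be sketched as a small feature-building step. This assumes a daily sentiment series already aligned with trading days. The helper name `lagged_features` and the toy scores are hypothetical, and the sketch covers only the sentiment inputs, not whatever price inputs the forecasting model also receives.

```python
def lagged_features(sentiment, max_lag):
    """For each day t >= max_lag, return [s[t-1], s[t-2], ..., s[t-max_lag]]."""
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    rows = []
    for t in range(max_lag, len(sentiment)):
        # Lag 1 is yesterday, lag 2 the day before yesterday, and so on.
        rows.append([sentiment[t - k] for k in range(1, max_lag + 1)])
    return rows


# Toy daily compound scores for illustration only.
daily_compound = [0.10, -0.30, 0.25, 0.40, -0.05]

# Third-order lag: the first usable day is index 3.
rows = lagged_features(daily_compound, 3)
print(rows)  # [[0.25, -0.3, 0.1], [0.4, 0.25, -0.3]]
```

Each additional lag order adds one input column and removes one usable day at the start of the series, so comparing lag orders fairly means evaluating them on the same test days.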

The first comparison is RMSE. According to Figure 4, when the text is not used, the RMSE is 1.40; this baseline does not depend on the lag order of the web information. In contrast, once the web text sentiment is used, the RMSE decreases significantly, by at least 0.2. The prediction error also differs across lag orders: it reaches its lowest level, 1.08, at the third order. At the fourth order, the accuracy deteriorates, and the RMSE increases by 0.08 compared with the third order. A likely reason is information overload: information from four days ago interferes with the oil price forecast.

Then we compare the EV. According to Figure 5, when the text sentiment is not used, the EV is 1.64 regardless of the lag order, and the error variance decreases by about 0.2 once the text sentiment is used. This suggests that web text information improves the stability of the prediction and helps correct the results. Furthermore, the lag order has no particularly significant effect on the stability of oil price predictions.

In summary, using the web text sentiment further improves the accuracy and stability of oil price predictions. The RMSE can be reduced by up to 0.32 (from 1.40 to 1.08), and the EV by about 0.2. Accuracy varies with the lag order of the text information: using the text sentiment with a third-order lag maximizes the accuracy of the prediction. However, changing the lag order does not lead to further changes in the stability of the oil price prediction.

#### 3.5. The Effect of Different Types of Text Sentiment

It can be seen from Section 3.4 that the comprehensive sentiment of the text has a relatively large positive effect on the performance of oil price prediction, improving both accuracy and stability. Some studies have pointed out that negative information has a greater impact on oil prices, but the extent of the specific improvement is unclear. Such a conclusion does not serve oil price forecasting very well, and thus we now conduct a more in-depth analysis.

As mentioned in Section 2.2, VADER can be used to extract three sentiment components of the text: $\mathrm{negative}_{t}$, $\mathrm{neutral}_{t}$, and $\mathrm{positive}_{t}$. We now feed these three factors into the BPNN oil price prediction model and assess how forecasting performance differs when different sentiment information is included. The results are shown in Figure 6.

First, we analyze the RMSE: as long as a text sentiment factor is added, whatever its type, the accuracy of the prediction improves. Second, the difference in accuracy between sentiment types is small and can even be regarded as random error. Moreover, the difference remains small regardless of how much sentiment information is added, whether a single type or several types together.

Then, we analyze the EV according to Figure 7: as long as a text sentiment factor is added, whatever its type, the stability of the prediction improves. Second, there is little difference in stability between sentiment types, and the difference can even be regarded as random error. Moreover, the difference remains small regardless of how much sentiment information is added, whether a single type or several types together. It can therefore be concluded that adding text sentiment information improves both the accuracy and the stability of the prediction, with no significant dependence on the type of sentiment, and that adding more types of sentiment information does not further improve prediction performance. Here, the sensitivity of oil prices to negative information is not fully reflected.

#### 3.6. The Effect of Text Sentiment with Different Strength

Generally speaking, text shows an obvious sentiment only when prominent events occur. Likewise, the oil price responds only to major events. Therefore, this section explores at which sentiment strengths on-line text provides a correction that is more conducive to oil price forecasts.

Figure 8 shows the distribution of errors for samples with different sentiment types and different sentiment strengths, and key statistical features are listed in Table 7. “Support” indicates the degree of support, that is, the number of days on which the daily text sentiment value falls within the interval; a statistical feature is sufficiently reliable only when the support is high enough. The mean error is the mean of the error within the corresponding interval, and the variance of error is the variance of the error within that interval. The last four columns give, for each level, the ratio of data points whose error exceeds a specific value.
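The per-level statistics described above can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: it assumes that "error" means absolute prediction error and that variance uses a $1/N$ factor. Only the thresholds 2 and 5 are named in the text, so the other two threshold columns of Table 7 are left out. The function name and toy data are hypothetical.

```python
from collections import defaultdict


def level_statistics(levels, errors, thresholds=(2.0, 5.0)):
    """Per level: support, mean error, variance of error, exceedance ratios."""
    grouped = defaultdict(list)
    for level, error in zip(levels, errors):
        grouped[level].append(abs(error))  # assumption: absolute error

    stats = {}
    for level, errs in sorted(grouped.items()):
        n = len(errs)
        mean = sum(errs) / n
        variance = sum((e - mean) ** 2 for e in errs) / n
        # Share of days in this level whose error exceeds each threshold.
        ratios = {th: sum(e > th for e in errs) / n for th in thresholds}
        stats[level] = {
            "support": n,
            "mean_error": mean,
            "variance": variance,
            "ratio_above": ratios,
        }
    return stats


# Toy data: the sentiment level of each day and the signed forecast error.
levels = [2, 2, 3, 3, 3, 3]
errors = [0.5, -1.5, 0.5, 2.5, -6.0, 1.0]
stats = level_statistics(levels, errors)
print(stats[3]["support"], stats[3]["ratio_above"][2.0])  # 4 0.5
```

Reporting support next to each ratio matters because a ratio computed from only a handful of days can swing widely, which is why levels with low support are set aside in the discussion.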

For the compound sentiment, Levels 1 to 5 in Figure 8 and Table 7 correspond to the intervals [–1, –0.6), [–0.6, –0.2), [–0.2, 0.2), [0.2, 0.6), and [0.6, 1], respectively. When $\mathrm{compound}_{t}$ is at Level 3 or 4, more bad cases appear, that is, there are many extreme error points compared with Levels 2 or 5. Similar outcomes can also be seen in Table 7. The proportion of bad cases at Level 3 is the highest: points with an error greater than 2 account for nearly 10% of all points at this level, and points with an error greater than 5 can still be found, accounting for 0.13% of all points, indicating that the degree of error is very high. Level 4, which has a similarly high level of support, is much better than Level 3, with somewhat fewer bad cases. In terms of comprehensive performance, Levels 1 and 5 are not considered because their support is too small. The average error at Level 3 is 0.91, and that at Levels 2 and 4 is less than 0.9, showing a decrease of about 0.1, which means that when $\mathrm{compound}_{t}$ takes larger or smaller values, it is more conducive to oil price forecasting.
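The mapping from a sentiment score to its level can be sketched with the interval edges quoted in the text. The compound intervals are those given above; the second set of edges covers the [0, 1] intervals that this section uses for the negative, neutral, and positive scores. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# Interior edges of the five half-open intervals quoted in the text.
COMPOUND_EDGES = [-0.6, -0.2, 0.2, 0.6]  # compound score, range [-1, 1]
COMPONENT_EDGES = [0.2, 0.4, 0.6, 0.8]   # negative/neutral/positive, range [0, 1]


def sentiment_level(score, edges, lower, upper):
    """Return the level (1 to 5) whose interval contains the score.

    Intervals are closed on the left and open on the right, except the
    last one, which also includes the upper bound.
    """
    if not lower <= score <= upper:
        raise ValueError(f"score {score} outside [{lower}, {upper}]")
    # bisect_right places a score equal to an edge in the interval above it.
    return bisect_right(edges, score) + 1


print(sentiment_level(0.0, COMPOUND_EDGES, -1.0, 1.0))    # 3
print(sentiment_level(-0.6, COMPOUND_EDGES, -1.0, 1.0))   # 2
print(sentiment_level(0.85, COMPONENT_EDGES, 0.0, 1.0))   # 5
```

Comparing against explicit edges, rather than dividing by the bin width, avoids floating-point rounding at the boundaries such as –0.6 or 0.2.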

For the negative sentiment, Levels 1 to 5 in Figure 8 and Table 7 correspond to the intervals [0, 0.2), [0.2, 0.4), [0.4, 0.6), [0.6, 0.8), and [0.8, 1], respectively. When $\mathrm{negative}_{t}$ is at Level 2 or 3, more bad cases appear, that is, there are many extreme error points. In contrast, they are fewer at Level 1. The proportion of bad cases at Level 3 is the highest: points with an error greater than 2 account for more than 10% of all points at this level, and points with an error greater than 5 can still be found, indicating extreme error. Level 2, which has higher support, is much better than Level 3: it contains more data points, but its proportion of bad cases is relatively small, indicating that the text sentiment has the effect of correcting extreme errors. In terms of comprehensive performance, the support at Levels 4 and 5 is too low to be considered. The average error at Level 3 is the highest, while those at Levels 2 and 1 are successively smaller. The average error at Level 1 is only 0.55, a decrease of about 46% compared with the highest value of 1.02, indicating that when $\mathrm{negative}_{t}$ takes larger or smaller values, it is more conducive to oil price forecasting.

For the neutral sentiment, Levels 1 to 5 in Figure 8 and Table 7 correspond to the intervals [0, 0.2), [0.2, 0.4), [0.4, 0.6), [0.6, 0.8), and [0.8, 1], respectively. When $\mathrm{neutral}_{t}$ is at Level 3, more bad cases appear, that is, there are many extreme error points; in contrast, they are rarer at Levels 2 and 4. Judging from the extreme error points and the ratios of data points whose error exceeds each threshold, the proportion of bad cases at Level 3 is the highest: points with an error greater than 2 account for nearly 9% of all points at this level, and points with an error greater than 5 can still be found, indicating extreme error. Levels 2 and 4, with their higher support, are much better than Level 3, and their proportion of bad cases is relatively small, indicating that text sentiment does have the effect of correcting extreme errors. In terms of comprehensive performance, the support at Levels 1 and 5 is too small to be considered. The average error at Level 3 is the highest, while the average error at Levels 2 and 4 is lower, dropping to nearly 0.4 at Level 4, indicating that when $\mathrm{neutral}_{t}$ takes larger or smaller values, it is more conducive to oil price prediction.

For the positive sentiment, Levels 1 to 5 in Figure 8 and Table 7 correspond to the intervals [0, 0.2), [0.2, 0.4), [0.4, 0.6), [0.6, 0.8), and [0.8, 1], respectively. When $\mathrm{positive}_{t}$ is at Level 3, more bad cases appear, that is, there are many extreme error points. In contrast, they are much rarer at Levels 2 and 4. A closer look at the ratios of data points with extreme error shows that the proportion of bad cases at Level 3 is the highest: points with an error greater than 2 account for nearly 10% of all points at this level, and points with an error greater than 5 can still be found, indicating that the degree of error is very high. Levels 2 and 4, with their higher support, are much better than Level 3, and their proportion of bad cases is relatively small, indicating that text sentiment does have the effect of correcting extreme errors. As for comprehensive performance, the support at Levels 1 and 5 is lower, and the average error at Level 3 is the highest, while that at Levels 2 and 4 is lower than that at Level 3, indicating that when $\mathrm{positive}_{t}$ takes larger or smaller values, it is more conducive to oil price forecasting.

The above analysis shows how the four types of sentiment support oil price predictions under different sentiment strengths; all four exhibit very similar properties. A pronounced sentiment score (whether clearly high or clearly low) is more conducive to suppressing bad cases and also brings a more obvious improvement in accuracy. Conversely, if the sentiment is not obvious, it may harm the original oil price prediction. When the sentiment score is clearly high or clearly low, the average prediction error can fall to about 0.5. In terms of the proportion of bad cases, strong sentiment is conducive to correcting the result, and the number of bad cases can be reduced by about 20%. Therefore, when using text sentiment to predict oil prices, the strength of the sentiment should be considered, and the text-based correction should be applied where the sentiment is more obvious, so as to maximize the accuracy and stability of the oil price prediction.