# Popularity Prediction of Online Contents via Cascade Graph and Temporal Information

## Abstract

## 1. Introduction

- We incorporate inter-infection duration information into our model using a Long Short-Term Memory (LSTM) network, which makes up for the deficiencies of existing graph neural network based approaches.
- Experimental results on two publicly available real-world datasets show that the proposed method significantly improves cascade prediction accuracy compared to several state-of-the-art baselines.
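To make the notion of inter-infection duration concrete, the following is a minimal sketch of how such a sequence could be derived from the retweet times of one cascade before it is fed to a sequence model. The function name, the example timestamps, and the choice to measure only the gaps between consecutive retweets (rather than also the gap from the original post) are assumptions for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def inter_infection_durations(timestamps: Sequence[float]) -> List[float]:
    """Return the gaps between consecutive infections (e.g. retweets).

    `timestamps` are seconds since the original post. They are sorted here
    so the caller does not have to guarantee ordering.
    """
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical cascade: retweets 30 s, 45 s, 120 s and 600 s after posting.
retweet_times = [30.0, 45.0, 120.0, 600.0]
durations = inter_infection_durations(retweet_times)

# The mean gap is a simple summary of how quickly the cascade spreads.
mean_duration = sum(durations) / len(durations)
```

A sequence such as `durations` is the kind of input an LSTM can consume step by step; any scaling or log-transform applied before that step is left open here.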

## 2. Background and Related Works

#### 2.1. Cascade Graph Representation

#### 2.2. Temporal Representation

## 3. Materials and Methods

#### 3.1. Problem Definition

#### 3.2. Methods

#### 3.2.1. Cascade Graph Representation

#### 3.2.2. Temporal Representation

#### 3.2.3. Predictor

## 4. Experiments and Results

#### 4.1. Datasets

- Sina Weibo: The dataset contained all microblogs posted on 1 June 2016, together with all of their retweets and the corresponding retweet times recorded within 24 h. Each node in a cascade graph was a user who retweeted the microblog, and each edge represented a retweet relationship between two users. Following previous works, we filtered out microblogs posted around midnight, since they usually gained less attention because fewer users were active online. We also dropped microblogs with fewer than 10 or more than 1000 retweets within the observation time window, because large cascades were few in number and might have dominated the training process.
- HEP-PH: The dataset included paper citation relationships and paper publication times from January 1993 to April 2003. Each node in a cascade graph represented a paper, and each edge represented a citation relationship.
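The Sina Weibo filtering rules described above can be sketched as a small predicate. This is an illustrative sketch only: the `Cascade` type, the function name, and in particular the `quiet_hours` default are assumptions, since the text says only that microblogs posted around midnight were removed and does not give the exact hours.

```python
from datetime import datetime
from typing import Container, List, NamedTuple


class Cascade(NamedTuple):
    post_time: datetime           # when the original microblog was posted
    retweet_offsets: List[float]  # seconds after post_time for each retweet


def keep_cascade(
    cascade: Cascade,
    window_seconds: float,
    min_size: int = 10,
    max_size: int = 1000,
    quiet_hours: Container[int] = range(0, 6),  # assumed; not given in the text
) -> bool:
    """Return True if the cascade survives both filtering rules."""
    # Rule 1: drop microblogs posted during the low-activity night hours.
    if cascade.post_time.hour in quiet_hours:
        return False
    # Rule 2: count retweets inside the observation window and keep only
    # cascades with between min_size and max_size of them (inclusive).
    observed = sum(1 for t in cascade.retweet_offsets if t <= window_seconds)
    return min_size <= observed <= max_size


# Hypothetical cascades used to exercise the predicate with a 1 h window.
night = Cascade(datetime(2016, 6, 1, 2, 0), [float(i) for i in range(50)])
too_small = Cascade(datetime(2016, 6, 1, 9, 0), [60.0] * 5)
kept = Cascade(datetime(2016, 6, 1, 9, 0), [60.0] * 10 + [7200.0] * 5)
results = [keep_cascade(c, 3600.0) for c in (night, too_small, kept)]
```

Counting only retweets inside the window matters: the third cascade has 15 retweets in total but is judged on the 10 that fall within the first hour.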

#### 4.2. Baselines

- DeepCas [39]: an end-to-end deep learning method that extracts structural information from the cascade graph by taking random walks in the context of the global graph, and uses a bi-GRU neural network for the cascade size prediction task.
- DeepHawkes [6]: bridges the gap between deep learning and self-exciting point processes by learning the structural representation of the cascade graph at the level of propagation paths, and takes the time decay effect into consideration when integrating path representations into the cascade representation.
- CasCN [7]: demonstrates the effectiveness of applying the graph neural network framework to generate the cascade graph representation. It claims to exploit both temporal and structural information by extracting cascade subgraphs from the cascade graph and using an LSTM neural network to model the dynamic changes of the cascade graph.

#### 4.3. Variants

- VGraph (mean pool): We removed the temporal representation component from our model and used the cascade graph representation alone. We also replaced the top-k pooling method in the cascade graph representation component with mean pooling, which uses the average of the embeddings of all nodes in the cascade as the cascade graph embedding.
- VGraph: We removed the temporal representation component from our model and used the cascade graph representation alone.
- VTemporal: We removed the cascade graph representation component and used the temporal representation component alone.
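The difference between the two pooling strategies named in these variants can be illustrated with a small pure-Python sketch. Mean pooling follows the description above directly. The top-k step follows the commonly used projection-score formulation (score each node against a projection vector, keep the k highest-scoring nodes, gate them with `tanh` of their score); whether the paper uses exactly this gating, and how it reads out the retained nodes, is an assumption here. The projection vector would be learned in practice and is fixed below for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(node_embeddings: Sequence[Vector]) -> Vector:
    """Average all node embeddings into one graph-level embedding."""
    n = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [sum(v[d] for v in node_embeddings) / n for d in range(dim)]


def top_k_pool(
    node_embeddings: Sequence[Vector], projection: Vector, k: int
) -> List[Vector]:
    """Keep the k nodes with the highest projection score, gated by tanh."""
    norm = math.sqrt(sum(p * p for p in projection))
    scores = [
        sum(x * p for x, p in zip(v, projection)) / norm
        for v in node_embeddings
    ]
    # Indices of the k largest scores, highest first.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [
        [x * math.tanh(scores[i]) for x in node_embeddings[i]] for i in top
    ]


# Hypothetical 2-dimensional embeddings for a three-node cascade.
nodes = [[1.0, 0.0], [0.0, 1.0], [3.0, 1.0]]
projection = [1.0, 0.0]  # stands in for a learned parameter

graph_mean = mean_pool(nodes)
kept_nodes = top_k_pool(nodes, projection, k=2)
graph_top_k = mean_pool(kept_nodes)  # assumed readout over retained nodes
```

Mean pooling weights every node equally, whereas the top-k step discards low-scoring nodes before the readout, so the graph embedding is driven by the nodes the model scores as most informative.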

#### 4.4. Performance Comparison

#### 4.4.1. Model vs. Baselines

#### 4.4.2. Variants Comparison

#### 4.4.3. Latent Representation

## 5. Conclusions and Future Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Bond, R.M.; Fariss, C.J.; Jones, J.J.; Kramer, A.D.; Marlow, C.; Settle, J.E.; Fowler, J.H. A 61-million-person experiment in social influence and political mobilization. Nature **2012**, 489, 295–298.
2. Wu, Q.; Gao, Y.; Gao, X.; Weng, P.; Chen, G. Dual sequential prediction models linking sequential recommendation and information dissemination. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 447–457.
3. Leskovec, J.; Adamic, L.A.; Huberman, B.A. The Dynamics of Viral Marketing. ACM Trans. Web. **2007**, 1, 5-es.
4. Cheng, J.; Adamic, L.A.; Dow, P.A.; Kleinberg, J.; Leskovec, J. Can cascades be predicted? In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 925–935.
5. Shulman, B.; Sharma, A.; Cosley, D. Predictability of popularity: Gaps between prediction and understanding. In Proceedings of the 10th International Conference on Web and Social Media, ICWSM 2016, Cologne, Germany, 17–20 May 2016; pp. 348–357.
6. Cao, Q.; Shen, H.; Cen, K.; Ouyang, W.; Cheng, X. DeepHawkes: Bridging the gap between prediction and understanding of information cascades. In Proceedings of the International Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; Volume Part F1318, pp. 1149–1158.
7. Chen, X.; Zhou, F.; Zhang, K.; Trajcevski, G.; Zhong, T.; Zhang, F. Information diffusion prediction via recurrent cascades convolution. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; Volume 2019, pp. 770–781.
8. Hamilton, W.L.; Ying, R.; Leskovec, J. Representation Learning on Graphs: Methods and Applications. arXiv **2017**, arXiv:1709.05584.
9. Aljohani, N.R.; Fayoumi, A.; Hassan, S.U. Bot prediction on social networks of Twitter in altmetrics using deep graph convolutional networks. Soft Comput. **2020**, 24, 11109–11120.
10. Hopwood, M.; Pho, P.; Mantzaris, A.V. Exploring the Value of Nodes with Multicommunity Membership for Classification with Graph Convolutional Neural Networks. Information **2021**, 12, 170.
11. Gao, L.; Liu, Y.; Zhuang, H.; Wang, H.; Zhou, B.; Li, A. Public Opinion Early Warning Agent Model: A Deep Learning Cascade Virality Prediction Model Based on Multi-Feature Fusion. Front. Neurorobotics **2021**, 15, 674322.
12. Szabo, G.; Huberman, B.A. Predicting the Popularity of Online Content. Commun. ACM **2010**, 53, 80–88.
13. Huang, Z.; Wang, Z.; Zhang, R. Cascade2vec: Learning Dynamic Cascade Representation by Recurrent Graph Neural Networks. IEEE Access **2019**, 7, 144800–144812.
14. Zhou, F.; Xu, X.; Zhang, K.; Trajcevski, G.; Zhong, T. Variational Information Diffusion for Probabilistic Cascades Prediction. In Proceedings of the 2020-IEEE Conference on Computer Communications, Toronto, ON, Canada, 6–9 July 2020.
15. Chen, X.; Zhang, K.; Zhou, F.; Trajcevski, G.; Zhong, T.; Zhang, F. Information cascades modeling via deep multi-task learning. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 885–888.
16. Feng, X.; Zhao, Q.; Liu, Z. Prediction of Information Cascades via Content and Structure Integrated Whole Graph Embedding. Inf. Sci. **2019**, 560, 424–440.
17. Zhao, Q.; Erdogdu, M.A.; He, H.Y.; Rajaraman, A.; Leskovec, J. SEISMIC: A self-exciting point process model for predicting tweet popularity. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 1513–1522.
18. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. **1997**, 9, 1735–1780.
19. Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 1025–1035.
20. Cangea, C.; Veličković, P.; Jovanović, N.; Kipf, T.; Liò, P. Towards Sparse Hierarchical Graph Classifiers. arXiv **2018**, arXiv:1811.01287.
21. Knyazev, B.; Taylor, G.; Amer, M. Understanding Attention and Generalization in Graph Neural Networks. arXiv **2019**, arXiv:1905.02850.
22. Weng, L.; Menczer, F.; Ahn, Y.Y. Predicting Successful Memes using Network and Community Structure. In Proceedings of the 8th International Conference on Weblogs and Social Media, Ann Arbor, MI, USA, 1–4 June 2014.
23. Carta, S.; Podda, A.S.; Recupero, D.R.; Saia, R.; Usai, G. Popularity Prediction of Instagram Posts. Information **2020**, 11, 453.
24. Ma, C.; Yan, Z.; Chen, C.W. LARM: A Lifetime Aware Regression Model for Predicting YouTube Video Popularity. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 467–476.
25. Kim, J.; Park, K.; Song, H.; Park, J.Y.; Cha, M. Learning How Spectator Reactions Affect Popularity on Twitch. In Proceedings of the 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), Busan, Korea, 19–22 February 2020; pp. 147–154.
26. Tsur, O.; Rappoport, A. What’s in a Hashtag? Content based Prediction of the Spread of Ideas in Microblogging Communities. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, Seattle, WA, USA, 8–12 February 2012; pp. 643–652.
27. Ma, Z.; Sun, A.; Cong, G. On predicting the popularity of newly emerging hashtags in Twitter. J. Am. Soc. Inf. Sci. Technol. **2013**, 64, 1399–1410.
28. Bao, P.; Shen, H.W.; Huang, J.; Cheng, X.Q. Popularity prediction in microblogging network: A case study on sina weibo. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 177–178.
29. Pinto, H.; Almeida, J.M.; Gonçalves, M.A. Using early view patterns to predict the popularity of youtube videos. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, Rome, Italy, 4–8 February 2013; pp. 365–374.
30. Newman, M. Spread of Epidemic Disease on Networks. Phys. Rev. Stat. Nonlinear Soft Matter Phys. **2002**, 66, 016128.
31. Pastor-Satorras, R.; Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. **2001**, 86, 3200–3203.
32. Shen, H.; Wang, D.; Song, C.; Barabási, A.L. Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; pp. 291–297.
33. Lin, S.; Kong, X.; Yu, P.S. Predicting trends in social networks via dynamic activeness model. In Proceedings of the International Conference on Information and Knowledge Management, San Francisco, CA, USA, 27 October–1 November 2013; pp. 1661–1666.
34. Kong, Q.; Rizoiu, M.A.; Xie, L. Exploiting Uncertainty in Popularity Prediction of Information Diffusion Cascades Using Self-Exciting Point Processes; Cornell University: New York, NY, USA, 2020.
35. Ma, T.; Zhou, H.; Tian, Y.; Al-Nabhan, N. A novel rumor detection algorithm based on entity recognition, sentence reconfiguration, and ordinary differential equation network. Neurocomputing **2021**, 447, 224–234.
36. Adewole, K.S.; Han, T.; Wu, W.; Song, H.; Sangaiah, A.K. Twitter spam account detection based on clustering and classification methods. J. Supercomput. **2020**, 76, 4802–4837.
37. Carta, S.; Corriga, A.; Mulas, R.; Recupero, D.; Saia, R. A Supervised Multi-class Multi-label Word Embeddings Approach for Toxic Comment Classification. In Proceedings of the 11th International Conference on Knowledge Discovery and Information Retrieval, Vienna, Austria, 17–19 September 2019.
38. Cao, Q.; Shen, H.; Gao, J.; Wei, B.; Cheng, X. Popularity prediction on social platforms with coupled graph neural networks. In Proceedings of the WSDM 2020-Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 70–78.
39. Li, C.; Ma, J.; Guo, X.; Mei, Q. DeepCas: An end-to-end predictor of information cascades. In Proceedings of the 26th International World Wide Web Conference, Perth, Australia, 3–7 April 2017; pp. 577–586.
40. Liu, Y.; Bao, Z.; Zhang, Z.; Tang, D.; Xiong, F. Information cascades prediction with attention neural network. Hum. Centric Comput. Inf. Sci. **2020**, 10, 1–16.
41. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. ICLR **2017**, 587, 1–14.
42. Veličković, P.; Casanova, A.; Liò, P.; Cucurull, G.; Romero, A.; Bengio, Y. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–12.
43. Fey, M.; Lenssen, J.E. Fast Graph Representation Learning with PyTorch Geometric. ICLR 2019 (RLGM Workshop). 2019. Available online: https://rlgm.github.io/papers/2.pdf (accessed on 13 May 2021).
44. Jiangli Shao, H.S. Temporal Convolutional Networks for Popularity Prediction of Messages on Social Medias. In China Conference on Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2019; Volume 4, p. 195.
45. Van der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. **2008**, 9, 2579–2605.

**Figure 2.** An overview of our proposed model. (**a**) Cascade graph representation: converts cascade graph information into low dimensional representation, i.e., cascade graph representation; (**b**) Temporal representation: learns temporal information based on LSTM neural network; (**c**) Predictor: maps cascade graph representation and temporal representation to popularity.

**Figure 3.** An example of the graph pooling process. The cascade graph representation ${h}_{g}$ is generated from the node representations by the graph pooling process.

**Figure 4.** Distribution of popularity: the X axis is the cascade size and the Y axis is the number of cascades of that size.

**Figure 5.** Saturation ratio: the X axis refers to the time and the Y axis is the percentage of cascade size.

**Figure 6.** (**a**) Mean inter-infection time (seconds) and the corresponding number of cascades; (**b**) the distribution of large cascades over different mean inter-infection times (minutes); (**c**) more than 90% of large cascades have a mean inter-infection time of less than 300 s.

**Figure 8.** Visualization of learned representations: (**a**,**c**,**e**) show the cascade graph representations with the observation time window set to 1 h, 2 h, and 3 h, respectively; (**b**,**d**,**f**) show the temporal representations with the observation time window set to 1 h, 2 h, and 3 h, respectively.

Datasets | Sina Weibo | | | HEP-PH | | |
---|---|---|---|---|---|---|
T | 1 h | 2 h | 3 h | 3 years | 5 years | 7 years |
Number of cascades | 51,287 | 61,448 | 66,798 | 9409 | 10,629 | 10,983 |
Number of nodes | 1,740,500 | 2,190,604 | 2,431,607 | 25,973 | 27,566 | 28,051 |
Number of edges | 3,404,975 | 4,454,060 | 5,028,177 | 189,590 | 255,159 | 284,016 |
Average cascade size | 66.39 | 72.49 | 75.27 | 20.15 | 24.01 | 25.86 |

Datasets | Weibo Dataset | | | HEP-PH | | |
---|---|---|---|---|---|---|
Metric | MSLE | | | | | |
T | 1 h | 2 h | 3 h | 3 years | 5 years | 7 years |
DeepCas | 2.958 | 2.689 | 2.647 | 1.765 | 1.538 | 1.462 |
DeepHawkes | 2.441 | 2.287 | 2.252 | 1.581 | 1.470 | 1.233 |
CasCN | 2.242 | 2.036 | 1.910 | 1.353 | 1.164 | 0.851 |
Proposed | 1.931 | 1.813 | 1.770 | 1.251 | 1.147 | 0.673 |
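For readers unfamiliar with the metric in the table above, the following is a minimal sketch of mean squared logarithmic error (MSLE) as it is commonly defined for cascade size prediction. The base-2 logarithm and the `offset` of 1 added before taking the log are assumptions here, since the exact definition used in the paper is not reproduced in this section. The helper `relative_reduction` is likewise hypothetical and simply expresses the gap between two reported scores as a fraction.

```python
import math
from typing import Sequence


def msle(
    predicted: Sequence[float], observed: Sequence[float], offset: float = 1.0
) -> float:
    """Mean squared difference between log-scaled predictions and targets."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = [
        (math.log2(p + offset) - math.log2(o + offset)) ** 2
        for p, o in zip(predicted, observed)
    ]
    return sum(squared_errors) / len(squared_errors)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Fraction by which `proposed` lowers the error of `baseline`."""
    return (baseline - proposed) / baseline


# Scores copied from the table: CasCN vs. the proposed model, Weibo, T = 1 h.
cascn_1h, proposed_1h = 2.242, 1.931
reduction_1h = relative_reduction(cascn_1h, proposed_1h)
```

With the values from the table, `reduction_1h` comes out at roughly 14%, which is one way to quantify the improvement over the strongest baseline at the shortest observation window.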

Datasets | Weibo Dataset | | |
---|---|---|---|
T | 1 h | 2 h | 3 h |
Baseline | | | |
DeepCas | 2.958 | 2.689 | 2.647 |
DeepHawkes | 2.441 | 2.287 | 2.252 |
CasCN | 2.242 | 2.036 | 1.910 |
Variants | | | |
VGraph (mean pool) | 2.379 | 2.286 | 2.207 |
VGraph | 2.360 | 2.231 | 2.164 |
VTemporal | 2.011 | 1.843 | 1.798 |
Proposed | 1.931 | 1.813 | 1.770 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Shang, Y.; Zhou, B.; Wang, Y.; Li, A.; Chen, K.; Song, Y.; Lin, C. Popularity Prediction of Online Contents via Cascade Graph and Temporal Information. *Axioms* **2021**, *10*, 159.
https://doi.org/10.3390/axioms10030159
