Open Access Article

Deep Reinforcement Learning for Query-Conditioned Video Summarization

1 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
2 University of Chinese Academy of Sciences, Beijing 100190, China
3 Machine Learning Group, UiT The Arctic University of Norway, Tromsø 9019, Norway
* Author to whom correspondence should be addressed.
Appl. Sci. 2019, 9(4), 750; https://doi.org/10.3390/app9040750
Received: 2 January 2019 / Revised: 29 January 2019 / Accepted: 18 February 2019 / Published: 21 February 2019
(This article belongs to the Section Computing and Artificial Intelligence)
Query-conditioned video summarization requires (1) finding a diverse set of video shots/frames that are representative of the whole video and (2) ensuring that the selected shots/frames are related to a given query. It can therefore be tailored to different user interests, yielding a more personalized summary, and it differs from generic video summarization, which focuses only on video content. Our work targets this query-conditioned video summarization task by first proposing a Mapping Network (MapNet) that expresses how related a shot is to a given query. MapNet establishes the relation between the two modalities (video and query), which allows visual information to be mapped to the query space. A deep reinforcement learning-based summarization network (SummNet) is then developed to provide personalized summaries by integrating relatedness, representativeness, and diversity rewards. These rewards jointly guide the agent to select the most representative and diverse video shots that are most related to the user query. Experimental results on a query-conditioned video summarization benchmark demonstrate the effectiveness of the proposed method, indicating the usefulness of both the mapping mechanism and the reinforcement learning approach.
Keywords: query-conditioned video summarization; deep reinforcement learning; visual-text embedding; temporal modeling; vision application
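
The abstract names three rewards (relatedness, representativeness, and diversity) but gives no formulas for them. As a rough illustration only, the following sketch shows one way such rewards could be combined into a single scalar for a set of selected shots. Everything in it is an assumption rather than the authors' method: the function name `combined_reward`, the cosine-based definitions, the equal default weights, and the `mapped` array standing in for a hypothetical MapNet output in query space.

```python
import numpy as np


def _unit(x):
    """L2-normalize along the last axis; the epsilon guards against zero vectors."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)


def combined_reward(shots, mapped, query, picked, weights=(1.0, 1.0, 1.0)):
    """Hypothetical combined reward for a set of selected shots.

    shots  : (T, D) visual shot features
    mapped : (T, Q) shot features mapped to query space (assumed MapNet output)
    query  : (Q,)   query embedding
    picked : indices of the shots chosen for the summary
    weights: (relatedness, representativeness, diversity) weights, assumed equal
    """
    picked = np.asarray(picked, dtype=int)
    if picked.size == 0:
        return 0.0

    s = _unit(shots)
    sim = s @ s.T  # (T, T) cosine similarity between all shots

    # Diversity (assumed form): mean pairwise dissimilarity among selected shots.
    k = picked.size
    if k > 1:
        sub = sim[np.ix_(picked, picked)]
        div = (1.0 - sub)[~np.eye(k, dtype=bool)].mean()
    else:
        div = 0.0

    # Representativeness (assumed form): every shot in the video should lie
    # close to at least one selected shot.
    dist = 1.0 - sim[:, picked]  # (T, k)
    rep = np.exp(-dist.min(axis=1).mean())

    # Relatedness (assumed form): mean cosine similarity between the mapped
    # selected shots and the query embedding.
    rel = (_unit(mapped[picked]) @ _unit(query)).mean()

    w_rel, w_rep, w_div = weights
    return float(w_rel * rel + w_rep * rep + w_div * div)
```

In a policy-gradient setup, a scalar like this would typically be computed once per sampled selection and used to weight the log-probabilities of the sampled actions. The abstract does not say which training algorithm SummNet uses, so that detail is left open here.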
MDPI and ACS Style

Zhang, Y.; Kampffmeyer, M.; Zhao, X.; Tan, M. Deep Reinforcement Learning for Query-Conditioned Video Summarization. Appl. Sci. 2019, 9, 750.

