Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning

by Soheyla Amirian, et al.

Over the last decade, the use of Deep Learning in many applications has produced results that are comparable to, and in some cases surpass, human expert performance. The application domains include diagnosing diseases, finance, agriculture, search engines, robot vision, and many others. In this paper, we propose an architecture that uses image/video captioning methods and Natural Language Processing (NLP) systems to generate a title and a concise abstract for a video. Such a system could be used in many application domains, including the cinema industry, video search engines, security surveillance, video databases/warehouses, data centers, and others. The proposed system operates as follows: it reads a video; representative image frames are identified and selected; the image frames are captioned; NLP, together with text summarization, is applied to all generated captions; and finally, a title and an abstract are generated for the video. All functions are performed automatically. Preliminary results are provided in this paper using publicly available datasets. This paper is not concerned with the execution-time efficiency of the system. We hope to address execution efficiency issues in subsequent publications.
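
The stages listed in the abstract (read video, select representative frames, caption them, summarize the captions, emit a title and abstract) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how frames are selected or how captions are summarized, so the frame-difference heuristic, the word-frequency ranking, the choice of the top-ranked caption as the title, and every function name below are assumptions made for illustration. Frames are toy lists of pixel intensities, and the captioning model is a caller-supplied stub.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]  # toy stand-in for an image: flat list of pixel intensities

STOPWORDS = {"a", "an", "the", "is", "are", "on", "in", "of", "and", "to", "with"}


def select_key_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Keep frame 0, then each frame whose mean absolute difference from the
    last kept frame exceeds `threshold` (hypothetical selection heuristic)."""
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        ref = frames[kept[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], ref)) / len(ref)
        if diff > threshold:
            kept.append(i)
    return kept


def rank_captions(captions: List[str]) -> List[str]:
    """Order captions by the average frequency of their non-stopword tokens
    across all captions (a simple extractive-summarization stand-in)."""
    tokenized = [[w for w in c.lower().split() if w not in STOPWORDS]
                 for c in captions]
    freq = Counter(w for toks in tokenized for w in toks)

    def score(toks: List[str]) -> float:
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    order = sorted(range(len(captions)), key=lambda i: -score(tokenized[i]))
    return [captions[i] for i in order]


def title_and_abstract(frames: List[Frame],
                       caption_fn: Callable[[Frame], str],
                       threshold: float = 0.5,
                       abstract_len: int = 2) -> Tuple[str, str]:
    """Run the assumed pipeline: key frames -> captions -> ranked summary."""
    keys = select_key_frames(frames, threshold)
    # Drop duplicate captions while preserving their first-seen order.
    captions = list(dict.fromkeys(caption_fn(frames[i]) for i in keys))
    ranked = rank_captions(captions)
    title = ranked[0] if ranked else ""  # assumption: top caption serves as title
    abstract = " ".join(c.rstrip(".") + "." for c in ranked[:abstract_len])
    return title, abstract


# Toy usage: a stub captioner in place of a trained captioning model.
toy_frames = [[0, 0], [0, 0.1], [1, 1], [1, 1], [0, 0]]
stub_caption = lambda f: "a dog runs on grass" if sum(f) > 1 else "a dog sits"
title, abstract = title_and_abstract(toy_frames, stub_caption)
```

In a real system the stub would be replaced by a trained captioning network and the ranking step by a proper summarization model; the sketch only shows how the stages connect.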




An Integrated Approach for Video Captioning and Applications

Physical computing infrastructure, data gathering, and algorithms have r...

Annotation Cleaning for the MSR-Video to Text Dataset

The video captioning task is to describe the video contents with natural...

Deep Natural Language Processing for LinkedIn Search Systems

Many search systems work with large amounts of natural language data, e....

Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization

Video summarization remains a huge challenge in computer vision due to t...

Use of Deep Learning in Modern Recommendation System: A Summary of Recent Works

With the exponential increase in the amount of digital information over ...

METEOR Guided Divergence for Video Captioning

Automatic video captioning aims for a holistic visual scene understandin...

EKO: Adaptive Sampling of Compressed Video Data

Researchers have presented systems for efficiently analysing video data ...
