Abstract
Recent advances in the film industry have driven exponential growth in movie production, creating a discoverability problem: faced with an overwhelming number of choices, audiences find choosing a film to watch tedious. Movie summarization (MS) can help by presenting the central theme of a movie in a compact format, making browsing more efficient. In this paper, we present an automatic MS framework, termed ‘QuickLook’, which identifies the leading characters and fuses multiple cues extracted from a movie. First, the movie is preprocessed to divide it into scenes, which are then segmented into shots. Second, the leading characters in each segmented scene are determined. Third, four visual cues that capture the film's scenic beauty, memorability, informativeness and emotional resonance are extracted from shots containing the leading characters. These cues are then fused by assigning them different weights; shots whose fusion score exceeds a threshold are selected for the final summary. The proposed MS framework is assessed by comparison with the official trailers of ten Hollywood movies, providing a novel baseline for future fair comparison in the MS literature. The proposed framework is shown to outperform other state-of-the-art MS methods in terms of enjoyability and informativeness.
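The fusion and selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion rule, the weight values, or the threshold, so everything here is an assumption made for illustration: the fusion is taken to be a plain weighted sum, the four cue scores are assumed to be normalized to [0, 1], and the names `Shot`, `WEIGHTS`, `THRESHOLD`, `fusion_score` and `select_shots` are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One segmented shot with its four cue scores (assumed to lie in [0, 1])."""
    shot_id: int
    has_leading_character: bool
    beauty: float
    memorability: float
    informativeness: float
    emotion: float


# Hypothetical values: the abstract states only that the cues receive
# different weights and that a threshold is applied, not what they are.
WEIGHTS = {
    "beauty": 0.25,
    "memorability": 0.25,
    "informativeness": 0.25,
    "emotion": 0.25,
}
THRESHOLD = 0.6


def fusion_score(shot, weights=WEIGHTS):
    """Weighted sum of the four cue scores (an assumed fusion rule)."""
    return sum(w * getattr(shot, name) for name, w in weights.items())


def select_shots(shots, weights=WEIGHTS, threshold=THRESHOLD):
    """Return ids of leading-character shots whose score exceeds the threshold."""
    return [
        s.shot_id
        for s in shots
        if s.has_leading_character and fusion_score(s, weights) > threshold
    ]


# Made-up scores, used only to exercise the functions above.
shots = [
    Shot(0, True, 0.9, 0.8, 0.7, 0.6),
    Shot(1, True, 0.2, 0.3, 0.4, 0.1),
    Shot(2, False, 0.9, 0.9, 0.9, 0.9),  # no leading character, so skipped
    Shot(3, True, 0.5, 0.75, 0.75, 0.5),
]
selected = select_shots(shots)
print(selected)
```

In this sketch, shots without a leading character are filtered out before scoring, mirroring the abstract's statement that cues are extracted only from shots containing the leading characters.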
Original language | English |
---|---|
Pages (from-to) | 24-35 |
Number of pages | 12 |
Journal | Information Fusion |
Volume | 76 |
Publication status | Published - Dec 2021 |
Keywords
- Deep learning
- Facial expressions
- Information fusion
- Movie analysis
- Psychological cues extraction
- Video summarization