A Study on Video Scene Quotation and Its Application to Video Scene Retrieval

Tomoki MASUDA
Department of Media Science, Graduate School of Information Science, Nagoya University

In this paper, we propose an efficient mechanism for retrieving video scenes. The mechanism uses video annotations extracted from users' activities of quoting video scenes and creating video scene playlists.

First, we developed a user interface with which users can easily quote an arbitrary time segment of a video. The quotation system allows users to pick multiple video scenes from multiple videos. In addition, the system offers a mechanism for browsing multiple video scenes synchronously on Weblogs when multiple scenes are quoted in the same paragraph. When users quote video scenes, the corresponding time segments are created, the text written in the paragraph is associated with those scenes, and the scenes are associated with one another.
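
To make the annotation structure concrete, here is a minimal sketch in Python of the data model the quotation step implies. All names (`VideoScene`, `Quotation`, the field names) are hypothetical illustrations; the thesis does not prescribe these types.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class VideoScene:
    # An arbitrary time segment of one video (identifiers hypothetical).
    video_id: str
    start_sec: float
    end_sec: float

@dataclass
class Quotation:
    # One Weblog paragraph that quotes one or more scenes.
    paragraph_text: str
    scenes: list = field(default_factory=list)

    def scene_text_pairs(self):
        # The paragraph text is associated with every scene it quotes.
        return [(s, self.paragraph_text) for s in self.scenes]

    def scene_scene_pairs(self):
        # Scenes quoted in the same paragraph become associated with
        # each other, enabling synchronized browsing and later analysis.
        return [(a, b) for i, a in enumerate(self.scenes)
                for b in self.scenes[i + 1:]]
```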

To extract further relations among video scenes, we also developed a system that enables users to create video scene playlists from quoted scenes. The playlists are classified by their creation method, and representative keywords extracted from the annotation contents are associated with the video scenes. The relationships among video scenes are then analyzed to integrate the several kinds of relations extracted by the two video scene annotation methods. Based on this analysis, our system calculates relevance values between video scenes and keywords.
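
As a rough illustration of the last step, the sketch below combines direct keyword annotations with scene-to-scene relations to score scene-keyword pairs. The combination rule and the `alpha` parameter are assumptions made for illustration, not the thesis's exact integration method.

```python
from collections import defaultdict

def scene_keyword_relevance(annotations, relations, alpha=0.5):
    # annotations: (scene_id, keyword) pairs gathered from quotations
    #              and playlists.
    # relations:   (scene_a, scene_b, weight) triples integrating the
    #              relations the two annotation methods extract.
    # alpha:       fraction of a related scene's keyword evidence that
    #              carries over (an assumed parameter).
    direct = defaultdict(lambda: defaultdict(float))
    for scene, keyword in annotations:
        direct[scene][keyword] += 1.0

    relevance = defaultdict(lambda: defaultdict(float))
    for scene, keywords in direct.items():
        for kw, v in keywords.items():
            relevance[scene][kw] += v

    # Keywords attached to a scene also raise the relevance of scenes
    # related to it, scaled by the relation weight.
    for a, b, w in relations:
        for kw, v in direct.get(b, {}).items():
            relevance[a][kw] += alpha * w * v
        for kw, v in direct.get(a, {}).items():
            relevance[b][kw] += alpha * w * v
    return relevance
```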

Furthermore, we developed a tag-based video scene retrieval system. The output of a retrieval is a list of videos, each presented with a collection of associated keywords, called a tag cloud, and a timeline seek bar on which quoted time segments are highlighted and the associated text comments are displayed. This information lets users overview and understand video content and thereby find the video scenes they want to watch. Our retrieval algorithm increases precision and recall by exploiting the relationships among video scenes.
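
Below is a minimal sketch of the ranking core of such tag-based retrieval, assuming the scene-keyword relevance values from the previous step are available as a nested mapping {scene: {keyword: value}}; the tag cloud and highlighted seek bar are presentation concerns omitted here.

```python
def retrieve_scenes(query_tags, relevance, top_k=10):
    # Score each scene by summing its relevance to every query tag,
    # then return the best-scoring scenes in descending order.
    scores = {}
    for scene, keywords in relevance.items():
        score = sum(keywords.get(tag, 0.0) for tag in query_tags)
        if score > 0.0:
            scores[scene] = score
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

For example, `retrieve_scenes(["goal", "replay"], relevance)` would return the ten scenes with the highest combined relevance to the two query tags.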

Finally, we performed a subjective experiment to evaluate the proposed mechanism by collecting video annotations and retrieving video scenes. We retrieved video scenes using video annotations extracted both by the conventional mechanism and by the proposed mechanism. As a result, recall, F-measure, and top-ranked precision were improved by applying the proposed mechanism; in particular, recall improved from 53.7% to 88.8%. These results confirm the usefulness of the proposed mechanism for video scene retrieval.


$R(s) = c \sum_{u \in B_s} \frac{R(u)\,W(s,u)}{\sum_{t \in B_u} W(t,u)}$ (1)
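
Equation (1) defines a scene's score recursively from the scores of related scenes, in a PageRank-style formulation. Below is a minimal sketch that solves it by fixed-point iteration, assuming `B[s]` is the set of scenes related to s, `W(s, u)` the weight of that relation, and `c` a free constant; these interpretations are not spelled out in this excerpt.

```python
def solve_relevance(scenes, B, W, c=0.85, iterations=50):
    # Fixed-point iteration for Eq. (1):
    #   R(s) = c * sum_{u in B_s} R(u) * W(s, u) / sum_{t in B_u} W(t, u)
    # Start from a uniform score over all scenes.
    R = {s: 1.0 / len(scenes) for s in scenes}
    # Precompute each scene u's total incoming relation weight,
    # i.e. sum over t in B_u of W(t, u), used as the normalizer.
    out = {u: sum(W(t, u) for t in B[u]) for u in scenes}
    for _ in range(iterations):
        R = {s: c * sum(R[u] * W(s, u) / out[u]
                        for u in B[s] if out[u] > 0.0)
             for s in scenes}
    return R
```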