A Study on Support System for Understanding Technical Documents Based on Annotations of Video Scenes and Technical Documents
Understanding as many technical documents as possible is important for improving our research activities. When we read technical documents, we can understand their research content more easily by referring to related resources such as images, audio, and video. Videos in particular convey various kinds of information, such as human motions and system demonstrations, that are difficult to express in words alone. Related videos therefore facilitate our understanding of technical documents. However, because both technical documents and videos generally contain a great deal of semantic information, it is difficult to find the video scenes related to a given part of a technical document. In this paper, we propose methods for defining video scenes, document elements, and the relationships between them. We also developed a system for browsing documents and videos simultaneously based on these relationships.
First, we developed a mechanism to segment video scenes and elements of technical documents and to annotate them with semantic information. Each video scene is described by start and end time stamps, a scene title, and comments. Each element of a technical document is described by its region in the document and by text data such as transcripts, comments, and translations. Next, we developed a mechanism for connecting video scenes to elements of technical documents. Based on these annotations and connections, we developed a support system for understanding technical documents with videos, called Docvie (Document with Movie). Docvie has two view modes: a video-featured mode, in which we can watch a video alongside related information including documents and other videos, and a document-featured mode, in which we can read a technical document alongside related information including videos and other documents. By switching flexibly between the modes, we can browse many technical documents and videos efficiently.
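The annotation structures described above can be illustrated with a minimal Python sketch. It is not Docvie's actual implementation: the class names, field names, the `page` field, the rectangle format of `region`, and the use of string keys are all assumptions made for illustration. It shows only how scenes (start and end time stamps, title, comments), document elements (region plus text data), and the connections between them might be stored and queried in both directions, which is what the two view modes require.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VideoScene:
    """A segment of a video (hypothetical field names)."""
    video_id: str
    start: float  # start time stamp in seconds
    end: float    # end time stamp in seconds
    title: str
    comments: list[str] = field(default_factory=list)


@dataclass
class DocumentElement:
    """A region of a technical document (hypothetical field names)."""
    document_id: str
    page: int  # assumption: regions are located per page
    region: tuple[float, float, float, float]  # assumption: x, y, width, height
    # Text data keyed by kind, e.g. "transcript", "comment", "translation".
    text: dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotationStore:
    """Holds annotations and scene-to-element connections."""
    scenes: dict[str, VideoScene] = field(default_factory=dict)
    elements: dict[str, DocumentElement] = field(default_factory=dict)
    links: set[tuple[str, str]] = field(default_factory=set)

    def connect(self, scene_key: str, element_key: str) -> None:
        """Connect an existing scene to an existing document element."""
        if scene_key not in self.scenes:
            raise KeyError(f"unknown scene: {scene_key}")
        if element_key not in self.elements:
            raise KeyError(f"unknown element: {element_key}")
        self.links.add((scene_key, element_key))

    def elements_for_scene(self, scene_key: str) -> list[DocumentElement]:
        """Lookup needed by a video-featured view: scene -> document elements."""
        keys = sorted(e for s, e in self.links if s == scene_key)
        return [self.elements[k] for k in keys]

    def scenes_for_element(self, element_key: str) -> list[VideoScene]:
        """Lookup needed by a document-featured view: element -> video scenes."""
        keys = sorted(s for s, e in self.links if e == element_key)
        return [self.scenes[k] for k in keys]

    def scene_keys_at(self, video_id: str, t: float) -> list[str]:
        """Keys of the scenes of one video that cover playback time t."""
        return sorted(
            key for key, scene in self.scenes.items()
            if scene.video_id == video_id and scene.start <= t <= scene.end
        )
```

Keeping the connections as a separate set of key pairs, rather than embedding references inside either annotation type, is one simple way to support many-to-many relationships and symmetric lookup from either side.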
We conducted experiments with subjects to confirm the usability of the system. Each subject watched a certain video scene and then answered questions about the research topics described in that scene, using Docvie and a comparison system that provides a function for reading technical documents and their related information. As a result, with Docvie each subject could answer more questions in a shorter time than with the other system. Questionnaires completed by the subjects indicated that the usability of Docvie was appropriate and that browsing technical documents together with related video scenes was very effective for understanding the content.