KEYWORDS: Detection and tracking algorithms, Visualization, Cameras, Video, Control systems, Software development, Optical resolution, Information technology, Information science, Information visualization
In this paper, we describe a method for indexing video lifelogs using sounds generated by human actions. The recent miniaturization of information-processing devices has made it possible to record video of one's experiences continuously. However, much of such a recording is of no use to the user, so it is important to extract only the useful experiences from the full record automatically. Given a tool that makes it easy to add indices, users can mark the experiences that matter to them. Although adding indices with a dedicated device would be easy, it increases the burden on users: ideally they should wear nothing beyond the microphone and video camera already needed to record their experiences.
We therefore propose a method by which users add indices using sounds that they produce with parts of their own bodies. In particular, we analyzed two typical sounds that users can generate themselves, hand clapping and finger clicking, and developed a method for detecting these two index-sounds. We performed an experiment to measure the recall ratio and relevance ratio with which the two index-sounds are detected.
A subject wore the wearable system and recorded experiences for ten days (about 100 hours). The proposed detection method was applied to the recorded data, and the two index-sounds were detected with a recall ratio of 86.0% and a relevance ratio of 83.6%.
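The two evaluation measures can be illustrated with a minimal sketch. It assumes that "relevance ratio" denotes what is commonly called precision (correct detections divided by all detections) and that "recall ratio" is correct detections divided by all index-sounds actually produced. The function name `recall_and_relevance` and the counts in the usage example are hypothetical and are not taken from the experiment.

```python
def recall_and_relevance(true_positives, false_negatives, false_positives):
    """Return (recall ratio, relevance ratio) from detection counts.

    true_positives:  index-sounds that were produced and detected
    false_negatives: index-sounds that were produced but missed
    false_positives: detections that were not actual index-sounds

    Assumption: "relevance ratio" is treated as precision.
    """
    produced = true_positives + false_negatives   # all real index-sounds
    detected = true_positives + false_positives   # all reported detections
    recall = true_positives / produced if produced else 0.0
    relevance = true_positives / detected if detected else 0.0
    return recall, relevance


# Illustrative counts only (not the paper's data).
recall, relevance = recall_and_relevance(
    true_positives=8, false_negatives=2, false_positives=2
)
print(f"recall ratio: {recall:.1%}, relevance ratio: {relevance:.1%}")
```

A high recall ratio means few index-sounds are missed, while a high relevance ratio means few other sounds are mistaken for index-sounds; reporting both shows the trade-off between the two kinds of error.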