In this paper, we propose a novel approach for real-time face detection in HD videos. The main goal is to detect small, distant faces in HD videos while preserving a low false-alarm rate. The proposed method first quickly scans for face candidates in the down-sampled chrominance channels using skin color features. The location and size of each candidate are then mapped into the original luminance channel, where true faces are filtered from the candidates using binary texture features. In our experiments, the proposed face detection method achieved about 20 ms/frame on HD videos. This means the proposed method is more than 30 times faster than the baseline Viola-Jones detector in processing time for 720p video. The gain in processing time will be higher for larger video such as 1080p. We also observed that the false-alarm rate was reduced to about 1/20 of that of the baseline approach.
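The first stage described above (a coarse skin-color scan in down-sampled chrominance, followed by mapping the candidate back to full resolution) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `skin_candidates`, the stride-based down-sampling, the fixed Cb/Cr thresholds (a commonly cited skin range, assumed here), and the single bounding box around all skin pixels are all assumptions. The second stage, which verifies candidates with binary texture features on the luminance channel, is omitted.

```python
import numpy as np

def skin_candidates(cb, cr, factor=4, cb_range=(77, 127), cr_range=(133, 173)):
    """Return candidate boxes (x, y, w, h) in full-resolution coordinates.

    cb, cr: 2-D arrays holding the chrominance channels of one frame.
    factor: down-sampling stride (hypothetical default).
    cb_range, cr_range: assumed fixed skin-color thresholds.
    """
    # Down-sample the chrominance channels by simple striding.
    cb_small = cb[::factor, ::factor]
    cr_small = cr[::factor, ::factor]

    # Mark pixels whose chrominance falls inside the assumed skin range.
    mask = ((cb_small >= cb_range[0]) & (cb_small <= cb_range[1]) &
            (cr_small >= cr_range[0]) & (cr_small <= cr_range[1]))
    if not mask.any():
        return []

    # Simplification: one box around all skin pixels. A fuller version
    # would label connected components and return one box per component.
    ys, xs = np.nonzero(mask)
    x0, x1 = int(xs.min()), int(xs.max())
    y0, y1 = int(ys.min()), int(ys.max())

    # Map the box from the down-sampled grid back to the original frame.
    return [(x0 * factor, y0 * factor,
             (x1 - x0 + 1) * factor, (y1 - y0 + 1) * factor)]
```

Scanning the down-sampled chrominance keeps the first pass cheap; only the returned boxes would then need the more expensive texture check at full resolution.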
Proc. SPIE. 6821, Multimedia on Mobile Devices 2008
KEYWORDS: Mobile devices, Internet, Statistical analysis, Personal digital assistants, Telecommunications, Zoom lenses, Multimedia, Statistical modeling, Laminated object manufacturing, Internet technology
As Internet and multimedia technologies advance, digital multimedia content is becoming increasingly
abundant in the learning domain. To facilitate access to digital knowledge and to meet the need for
lifelong learning, e-learning can be a helpful alternative to conventional learning paradigms. E-learning is
known as a unifying term for online, web-based, and technology-delivered learning. Mobile learning (m-learning)
is defined as e-learning through mobile devices using wireless transmission. In a survey, more than half of the respondents
remarked that re-consumption was one of the convenient features of e-learning. However, it is not easy to find a user's
preferred segments in a full-length e-learning content item. In m-learning especially, a content-summarization
method is strongly required because mobile devices have limited processing power and battery
capacity. In this paper, we propose a new user preference model for re-consumption in order to construct a personalized
summarization. The user preference for re-consumption is modeled from user actions with a
statistical model. Based on this user preference model with personalized user actions, our method
discriminates preferred parts over the entire content. Experimental results demonstrated successful personalized
An image content adaptation for visually impaired people based on the MPEG-21 Digital Item Adaptation (DIA)
standard is proposed. The content adaptation mainly considers the spatial contrast vision characteristics of users, which are
represented by a contrast sensitivity function (CSF). There are three key contributions of the paper. First, the visual
perception of users who have different spatial contrast vision abilities is simulated by incorporating the human visual system (HVS) model
proposed by Pattanaik et al. Second, to measure spatial contrast vision, and thus realize personalized content
adaptation depending on the severity of the individual user's visual impairment, the CSF is measured in a computer-based
environment. The measured spatial contrast vision symptom and its severity are represented in an interoperable way by
using an example of an extended description tool provided by the MPEG-21 DIA specification. Third, a content
adaptation is also proposed, which is personalized in the sense that the adapted content is optimized for the given
description of a particular symptom and its severity. To assess the effectiveness of the proposed methods, we performed
a number of experiments targeting users with low vision and showed how to determine and describe the CSF
parameters. Furthermore, a statistical experiment was performed to verify the effectiveness of the proposed adaptation
process for users with the low vision symptom.
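To make the notion of a contrast sensitivity function concrete, the sketch below evaluates a widely used analytic CSF approximation (the Mannos and Sakrison form) and locates its peak. This is only an illustration of what a CSF looks like: it is not the per-user CSF measured in the paper, nor the Pattanaik et al. model, and the helper `peak_frequency` with its search grid is a hypothetical convenience.

```python
import math

def csf_mannos_sakrison(f):
    """Relative contrast sensitivity at spatial frequency f (cycles/degree).

    Analytic approximation for normal vision, used here for illustration only.
    """
    return 2.6 * (0.0192 + 0.114 * f) * math.exp(-((0.114 * f) ** 1.1))

def peak_frequency(lo=0.5, hi=30.0, step=0.1):
    """Grid-search the frequency at which the CSF above is largest."""
    n = int(round((hi - lo) / step)) + 1
    freqs = [lo + i * step for i in range(n)]
    return max(freqs, key=csf_mannos_sakrison)
```

The curve is band-pass: sensitivity rises to a peak at mid frequencies and falls off toward high frequencies. A user-specific CSF of the kind the abstract describes would be measured separately for each user rather than taken from a fixed formula.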
In this paper, we propose a promising new photo album application format, which enables augmented use of digital
home photos over a wide range of mobile devices and semantic photo consumption while minimizing the user's manual tasks.
The photo album application format packages a photo collection and its associated metadata based on the MPEG-4 file format.
The schema of the album metadata is designed in two levels: collection- and item-level descriptions. The collection-level
description is metadata related to a group of photos, each of which has an item-level description that contains its
detailed information. To demonstrate the use of the proposed album format on mobile devices, a photo album system
was also developed, which realizes semantic photo consumption in terms of situation, category, and person.
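The two-level metadata layout described above can be pictured with a small data model. This is a sketch under stated assumptions: the class names, field names, and the `photos_of` helper are hypothetical and do not reproduce the actual schema of the application format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ItemDescription:
    """Item-level description: details of a single photo (fields assumed)."""
    photo_file: str
    situation: str = ""
    category: str = ""
    persons: List[str] = field(default_factory=list)

@dataclass
class CollectionDescription:
    """Collection-level description: metadata for a group of photos."""
    title: str
    items: List[ItemDescription] = field(default_factory=list)

    def photos_of(self, person: str) -> List[str]:
        # Person-based browsing: list the photos that include the given person.
        return [it.photo_file for it in self.items if person in it.persons]
```

For example, a collection titled "Trip" holding two items can be queried with `photos_of("Ann")` to browse by person; browsing by situation or category would filter on the corresponding fields in the same way.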
In this paper, we propose a method to cluster digital home photos according to user-centric functionalities, namely event/situation-based photo clustering, category-based photo clustering, and person-identity-based photo clustering and indexing. The main idea of the user-centric photo album is to enable users to organize and browse their photos along the semantically meaningful axes of situation, category, and person identity. Experimental results showed that the proposed method is useful for organizing a photo album based on human perception.
In this paper, we propose an automatic situation clustering method for digital photo albums. A group of photos taken in the same situation tends to have similar visual semantics. Visual semantic hints of photos are therefore proposed and used to cluster situations. Experiments were performed with 2345 photos, and the results showed that the proposed clustering with visual semantic hints was useful for automated situation clustering based on human perception.
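One simple way to picture situation clustering is a sequential pass that starts a new situation whenever consecutive photos differ too much. The sketch below is an assumption-laden illustration, not the paper's method: the plain numeric feature vectors stand in for the visual semantic hints, the photos are assumed to be ordered by capture time, and the Euclidean distance and the threshold are hypothetical choices.

```python
import math

def cluster_situations(features, threshold):
    """Group photo indices into consecutive situation clusters.

    features: list of equal-length numeric vectors, one per photo,
              assumed to be in capture order.
    threshold: maximum distance between neighbouring photos that still
               counts as the same situation (hypothetical parameter).
    """
    clusters = []
    for i, feat in enumerate(features):
        # Continue the current situation if this photo resembles the previous one.
        if i > 0 and math.dist(features[i - 1], feat) <= threshold:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters
```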
In this paper, we propose content adaptation for visual impairments in MPEG-21. The proposed content adaptation aims
to give enhanced visual accessibility to users with visual impairments. In this paper, we consider two major
visual impairments: low vision and color vision deficiency. The proposed method includes a description of
the visual impairments and a content adaptation technique based on it. We have developed a symptom-based description
of visual impairment characteristics for users with visual impairments in the context of MPEG-21 Digital Item Adaptation
(DIA). To verify the usefulness of the proposed method, we performed experiments with content adaptation based
on the description in MPEG-21. The experimental results showed that the proposed method is an effective content adaptation
for users with visual impairments and gives them enhanced visual accessibility.
As color is more widely used to carry visual information in multimedia content, the ability to perceive color plays a crucial role in obtaining visual information. Regardless of color vision variations, everyone should have equal access to visual information. This paper proposes an adaptation technique for color vision variations in MPEG-21 Digital Item Adaptation (DIA). DIA is performed separately for severe color vision deficiency (dichromats) and for mild color vision deficiency (anomalous trichromats), according to the description of the user's color vision characteristics. Adapted images are tested with a simulation program for color vision variations to check how they appear to color-deficient vision. Experimental results show that the proposed adaptation technique works well in the MPEG-21 framework.
We propose an automatic image categorization technique for content-based image filtering and retrieval systems. A category-feature database for image categorization is constructed based on human visual perception. Query images are automatically classified into predefined categories by content-based description using MPEG-7. Similarity distances for each category are measured using multiple MPEG-7 descriptors, and a matching technique for combining the multiple similarity distances is proposed. The proposed method takes into account the categorization performance of each single descriptor in each category. To evaluate the proposed method, it is applied to a large number of query images randomly collected from the Internet and other image databases.
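Combining several descriptor distances with per-category weights can be sketched as a normalized weighted sum. This is an illustrative assumption rather than the paper's matching technique: the function name `classify`, the dictionary layout, and the idea of deriving each weight from how well a descriptor performs for that category are all hypothetical.

```python
def classify(distances, weights):
    """Pick the category with the smallest weighted distance.

    distances[category][descriptor]: distance between the query image and
        the category's reference features for that descriptor.
    weights[category][descriptor]: assumed reliability of that descriptor
        for that category (larger means more trusted).
    Returns (best_category, scores), where scores maps each category to
    its combined distance.
    """
    scores = {}
    for category, per_descriptor in distances.items():
        w = weights[category]
        total_weight = sum(w[name] for name in per_descriptor)
        # Weighted average so that categories with different weight sums
        # remain comparable.
        scores[category] = sum(
            w[name] * d for name, d in per_descriptor.items()
        ) / total_weight
    best = min(scores, key=scores.get)
    return best, scores
```

Weighting per category lets a descriptor that separates one category well (for example, color for landscapes) dominate there without distorting decisions for categories where it is less reliable.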
The increased availability and usage of digital video lead to a need for automated video content analysis techniques. Most research on digital video content analysis includes automatic detection of shot boundaries. However, those methods are not efficient in terms of computational time. In this paper, we propose a digital video camera system that provides real-time shot boundary detection using MPEG-7 descriptors. The video camera system is built so that MPEG-7 descriptors are extracted from the video frames. Shot boundaries are detected by measuring the distance between the MPEG-7 descriptors of consecutive frames in real time. Experimental results showed that the proposed video camera system provides fast and effective real-time shot boundary detection.
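Detecting shot boundaries from the distance between descriptors of consecutive frames can be sketched as a thresholded comparison. The sketch makes several assumptions not stated in the abstract: each frame's descriptor is a plain numeric vector (standing in for an MPEG-7 descriptor), the distance is the L1 norm, and the fixed `threshold` is a hypothetical parameter.

```python
def detect_shot_boundaries(descriptors, threshold):
    """Return indices of frames that start a new shot.

    descriptors: list of equal-length numeric vectors, one per frame.
    threshold: L1 distance above which two consecutive frames are
               treated as belonging to different shots (assumed value).
    """
    boundaries = []
    for i in range(1, len(descriptors)):
        prev, curr = descriptors[i - 1], descriptors[i]
        # L1 distance between the descriptors of consecutive frames.
        distance = sum(abs(a - b) for a, b in zip(prev, curr))
        if distance > threshold:
            boundaries.append(i)
    return boundaries
```

Because each frame is compared only with its predecessor, the work per frame is constant, which is what makes this kind of check suitable for real-time use.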
We propose a content-based image filtering technique for information filtering agent systems. In the proposed technique, the MPEG-7 content description is adopted for image filtering. To verify the usefulness of the proposed method, an image-filtering agent system was developed on the network layer. MPEG-7 texture and color descriptors were employed as the content description, and MPEG-7 encoding of the descriptors was performed just after all packets of the image data were received. Experimental results show that the similarity-filtering ratio of the proposed method is much higher than that of the conventional method, without any cost in network speed.