Discriminative feature representation for image classification via multimodal multitask deep neural networks

Shuang Mei; Hua Yang; Zhouping Yin

doi:10.1117/1.JEI.26.1.013023

Journal of Electronic Imaging, Vol. 26, Issue 1, 013023 (February 2017). https://doi.org/10.1117/1.JEI.26.1.013023

Abstract

A good image feature representation is crucial for image classification. Many traditional approaches design single-modal features, but such features often fail to capture sufficient information, which leads to misclassification in various categories. Recently, researchers have focused on designing multimodal features, which have been employed successfully in many situations. However, open problems remain in this area, including selecting efficient features for each modality, transforming them into the subspace feature domain, and removing the heterogeneities among different modalities. We propose an end-to-end multimodal deep neural network (MDNN) framework that automates the feature selection and transformation procedures for image classification. Furthermore, inspired by Fisher’s theory of linear discriminant analysis, we extend MDNN into a multimodal multitask deep neural network (M2DNN) model. The motivation behind M2DNN is to improve classification performance by imposing an auxiliary discriminative constraint on the subspace representation. Experimental results on five representative datasets (NUS-WIDE, Scene-15, Texture-25, Indoor-67, and Caltech-101) demonstrate the effectiveness of the proposed MDNN and M2DNN models. In addition, comparisons under the Fisher score criterion show that M2DNN is more robust and has better discriminative power than other approaches.
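
The abstract does not define the Fisher score it uses. As an illustration only, the sketch below computes one common form of the criterion: the ratio of between-class scatter to within-class scatter (trace form) over a set of feature vectors. The function name `fisher_score` and the toy data are hypothetical and are not taken from the paper; the authors' exact formulation may differ.

```python
from collections import defaultdict


def fisher_score(features, labels):
    """Between-class scatter divided by within-class scatter (trace form).

    Assumption: this is a generic Fisher-style criterion, not necessarily
    the definition used in the paper. Higher values mean that classes are
    farther apart relative to their internal spread.
    """
    if not features or len(features) != len(labels):
        raise ValueError("features and labels must be non-empty and equal in length")

    dim = len(features[0])

    def mean(rows):
        # Component-wise mean of a list of equal-length vectors.
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Group the feature vectors by class label.
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)

    overall = mean(features)
    between = 0.0
    within = 0.0
    for rows in groups.values():
        mu = mean(rows)
        between += len(rows) * sq_dist(mu, overall)   # class mean vs. global mean
        within += sum(sq_dist(r, mu) for r in rows)   # samples vs. their class mean

    if within == 0.0:
        raise ValueError("within-class scatter is zero; score is undefined")
    return between / within


# Hypothetical toy data: two classes, two-dimensional features.
labels = [0, 0, 1, 1]
separated = [[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]]
mixed = [[0.0, 0.0], [4.0, 1.0], [4.0, 0.0], [0.0, 1.0]]

print(fisher_score(separated, labels))
print(fisher_score(mixed, labels))
```

In this toy setup the well-separated features score higher than the mixed ones, which is the sense in which a larger Fisher score indicates a more discriminative representation.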

Citation

Shuang Mei, Hua Yang, and Zhouping Yin "Discriminative feature representation for image classification via multimodal multitask deep neural networks," Journal of Electronic Imaging 26(1), 013023 (24 February 2017). https://doi.org/10.1117/1.JEI.26.1.013023

Received: 10 September 2016; Accepted: 24 January 2017; Published: 24 February 2017
