Multi-task attribute-fusion model for fine-grained image recognition

Mengze Li; Ming Kong; Kun Kuang; Qiang Zhu; Fei Wu

doi:10.1117/12.2573311

10 October 2020

Proceedings Volume 11550, Optoelectronic Imaging and Multimedia Technology VII; 115500J (2020) https://doi.org/10.1117/12.2573311
Event: SPIE/COS Photonics Asia, 2020, Online Only

Abstract

Attribute information in fine-grained image recognition often provides accurate and rich category-related information. How to use such knowledge effectively to guide image classification has been an active research topic in computer vision in recent years. We believe that fusing attribute information according to the associations between attributes yields a more accurate representation of the image. In this paper, we propose a novel Multi-Task Attribute Fusion Model (MTAF), which makes two major improvements to the traditional multi-task learning framework: 1) Attribute-Aware Feature Discrimination: spatial attention and channel attention are combined to enhance the CNN feature map, so that each attribute can be associated with the important positions and important channels of the image; 2) Transformer-Based Feature Fusion: a Transformer model is introduced to better learn the logical associations between attributes, so that the reconstructed features achieve better classification performance. We verify our algorithm on two datasets: a self-collected medical dataset for benign versus malignant thyroid identification, and an open dataset widely used for fine-grained image recognition. Experimental results on both datasets demonstrate that the proposed method achieves higher classification accuracy than the baselines.
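The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify layer details, so the code below assumes parameter-free sigmoid gates for the channel and spatial attention, one hypothetical feature map per attribute, global average pooling to turn each map into a token, and single-head self-attention with identity projections as a stand-in for a Transformer layer. All function names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a sigmoid gate of its global average (assumed form)."""
    gated = []
    for channel in fmap:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = sigmoid(mean)
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def spatial_attention(fmap):
    """Scale each position by a sigmoid gate of its mean across channels (assumed form)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]
    return [[[fmap[k][i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for k in range(c)]


def global_avg_pool(fmap):
    """Collapse a C x H x W feature map into a length-C vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    fused = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in tokens]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        fused.append([sum(wt * val[i] for wt, val in zip(weights, tokens)) for i in range(d)])
    return fused


def fuse_attributes(attribute_maps):
    """Enhance each attribute's feature map, pool it to a token, then mix the tokens."""
    tokens = [global_avg_pool(spatial_attention(channel_attention(m)))
              for m in attribute_maps]
    return self_attention(tokens)


# Two hypothetical attributes, each with a 2-channel 2x2 feature map.
attribute_maps = [
    [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]],
    [[[0.5, 0.5], [0.5, 0.5]], [[2.0, 2.0], [2.0, 2.0]]],
]
fused_features = fuse_attributes(attribute_maps)
```

In this sketch, `fused_features` holds one vector per attribute, each a weighted mixture of all attribute tokens; a classifier head would then operate on these fused vectors. A real model would replace the fixed gates and identity projections with learned layers.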

Citation

Mengze Li, Ming Kong, Kun Kuang, Qiang Zhu, and Fei Wu "Multi-task attribute-fusion model for fine-grained image recognition", Proc. SPIE 11550, Optoelectronic Imaging and Multimedia Technology VII, 115500J (10 October 2020); https://doi.org/10.1117/12.2573311
