Image colorization based on transformer with sparse attention

Yongsheng Zhang; Siyu Wu

doi:10.1117/12.3021490

19 February 2024

Proceedings Volume 13063, Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023); 130630J (2024) https://doi.org/10.1117/12.3021490
Event: Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023), 2023, Changchun, China

Abstract

Automatic colorization of grayscale images remains a challenging problem, and in recent years approaches based on deep neural networks have become the mainstream technique. However, their results remain unsatisfactory, often lacking color richness and structural consistency. To address this, we propose an image colorization method that combines a Transformer network with a sparse attention mechanism and a color distribution predictor. First, the color distribution predictor uses anchors to predict the color distribution of different regions or pixels in the image, so that the model can better capture the color relationships between regions and produce colorizations that are more natural and more consistent with real-world color distributions. The sparse attention-based Transformer network then generates a low-resolution coarse colorization by referring to the sampled anchor colors, which is subsequently upsampled to a high-resolution image. Sparse attention not only accelerates training and inference of the Transformer model, but also improves colorization quality and preserves image details. The results show that our method reduces computational complexity, improves efficiency, and produces more realistic color images.
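The abstract does not state which sparsity pattern the attention uses, so the following is only a minimal sketch of one common variant, top-k sparse attention, in which each query attends to its k highest-scoring keys instead of all of them. It is not the authors' implementation. The function name `topk_sparse_attention`, the toy feature vectors, and the reading of the values as anchor chroma pairs are all assumptions made for illustration.

```python
import math


def topk_sparse_attention(queries, keys, values, k):
    """Scaled dot-product attention where each query attends to its k best keys.

    queries: list of n_q vectors of length d
    keys:    list of n_k vectors of length d
    values:  list of n_k vectors of length d_v
    (Hypothetical helper; the paper's actual sparsity pattern may differ.)
    """
    d = len(keys[0])
    k = min(k, len(keys))
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in keys]
        # Sparsity step: keep only the indices of the k largest scores.
        kept = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax over the kept scores only; subtract the max for stability.
        top = max(scores[j] for j in kept)
        weights = {j: math.exp(scores[j] - top) for j in kept}
        total = sum(weights.values())
        # Weighted sum of the kept value vectors.
        out = [0.0] * len(values[0])
        for j, w in weights.items():
            for c, v in enumerate(values[j]):
                out[c] += (w / total) * v
        outputs.append(out)
    return outputs


# Toy example: 3 grayscale-feature tokens attending over 4 assumed anchor colors.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
values = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5], [0.2, 0.2]]  # assumed chroma pairs
coarse = topk_sparse_attention(queries, keys, values, k=2)
```

With k equal to the number of keys this reduces to ordinary dense attention; with smaller k, each query mixes fewer value vectors. The loop above still scores every key before selecting, so it only illustrates the selection rule; an efficient implementation would avoid computing the discarded scores, which is where the speedup claimed in the abstract would have to come from.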

(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Yongsheng Zhang and Siyu Wu "Image colorization based on transformer with sparse attention", Proc. SPIE 13063, Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023), 130630J (19 February 2024); https://doi.org/10.1117/12.3021490
