CLIP-optimized prompt for zero-shot wild animal identification
21 October 2024
Yifei Shao, Pengfei Li, Lian Yu, Shiyue Wang, Runqi Ai
Proceedings Volume 13399, Ninth International Workshop on Pattern Recognition; 1339902 (2024) https://doi.org/10.1117/12.3052992
Event: Ninth International Workshop on Pattern Recognition, 2024, Xiamen, China
Abstract
Large pre-trained models such as CLIP, Llama, and GPT are trained on vast datasets and have tens of millions to billions of parameters. Compared with traditional models, they extract far more of the knowledge embedded in their training data and show strong generality and generalization. CLIP, for instance, was trained on 400 million image-text pairs and has acquired substantial multimodal abstract knowledge and robust representational ability [1]. In specific downstream tasks, however, traditional models can degrade severely because of complex task scenarios, limited computational resources, and poor data quality, among other issues. To address these challenges, we propose CLIP-ZSWAI, a method that requires no additional training of the model and instead achieves zero-shot transfer of CLIP to a specific task by optimizing the prompt information [2]. The approach draws on CLIP’s multimodal knowledge to offset the performance deficit that smaller models suffer when data are scarce. CLIP-ZSWAI inherits the abstract knowledge that CLIP learned during training and adjusts the classifier weights by optimizing the text prompts, which yields efficient transfer to downstream tasks. We validate the method on the Animal90 and Animal19 datasets and on AnimalQH, a dataset we collected ourselves, where it outperforms other methods.
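The idea that text prompts act as classifier weights can be illustrated with a minimal sketch. It is not the CLIP-ZSWAI procedure, which the abstract does not specify. It assumes the common zero-shot recipe: embed several prompts per class, average the normalized embeddings into one weight vector per class, then score an image embedding by scaled cosine similarity and softmax. The 3-dimensional vectors, the class names, the function names, and the scale of 100 are all hypothetical stand-ins; in practice the embeddings would come from CLIP's text and image encoders.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def class_weight(prompt_embeddings):
    """Average the normalized embeddings of one class's prompts, then renormalize."""
    normed = [l2_normalize(e) for e in prompt_embeddings]
    dim = len(normed[0])
    mean = [sum(e[i] for e in normed) / len(normed) for i in range(dim)]
    return l2_normalize(mean)

def zero_shot_probs(image_embedding, weights, scale=100.0):
    """Softmax over scaled cosine similarities between an image and each class weight."""
    img = l2_normalize(image_embedding)
    logits = {
        name: scale * sum(a * b for a, b in zip(img, w))
        for name, w in weights.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(v - top) for name, v in logits.items()}
    total = sum(exps.values())
    return {name: v / total for name, v in exps.items()}

# Hypothetical toy embeddings: two prompts per class, e.g. "a photo of a red fox"
# and "a camera-trap image of a red fox". Real values would come from CLIP.
text_embeddings = {
    "red fox": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "snow leopard": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.3]],
}

# Changing the prompts changes these weights; the image encoder is never retrained.
weights = {name: class_weight(embs) for name, embs in text_embeddings.items()}

image_embedding = [0.85, 0.15, 0.05]  # hypothetical image feature
probs = zero_shot_probs(image_embedding, weights)
prediction = max(probs, key=probs.get)
print(prediction, probs)
```

Because the class weights depend only on the prompt embeddings, rewording or adding prompts changes the classifier without touching any model parameters, which is the sense in which prompt optimization can stand in for training.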
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yifei Shao, Pengfei Li, Lian Yu, Shiyue Wang, and Runqi Ai "CLIP-optimized prompt for zero-shot wild animal identification", Proc. SPIE 13399, Ninth International Workshop on Pattern Recognition, 1339902 (21 October 2024); https://doi.org/10.1117/12.3052992
KEYWORDS
Animal model studies
Data modeling
Performance modeling
Visual process modeling
Education and training
Animals
Mathematical optimization