Transductive Zero-Shot and Few-Shot CLIP - Centre pour l'Apprentissage Statistique et la Vision Numérique Accéder directement au contenu
Communication Dans Un Congrès Année : 2024

Transductive Zero-Shot and Few-Shot CLIP

Résumé

Transductive inference has been widely investigated in few-shot image classification, but completely overlooked in the recent, fast growing literature on adapting vision-langage models like CLIP. This paper addresses the transductive zero-shot and few-shot CLIP classification challenge, in which inference is performed jointly across a mini-batch of unlabeled query samples, rather than treating each instance independently. We initially construct informative vision-text probability features, leading to a classification problem on the unit simplex set. Inspired by Expectation-Maximization (EM), our optimization-based classification objective models the data probability distribution for each class using a Dirichlet law. The minimization problem is then tackled with a novel block Majorization-Minimization algorithm, which simultaneously estimates the distribution parameters and class assignments. Extensive numerical experiments on 11 datasets underscore the benefits and efficacy of our batch inference approach. On zero-shot tasks with test batches of 75 samples, our approach yields near 20% improvement in ImageNet accuracy over CLIP's zero-shot performance. Additionally, we outperform state-of-the-art methods in the few-shot setting. The code is available at: https://github.com/SegoleneMartin/transductive-CLIP.
Fichier principal
Vignette du fichier
main.pdf (45.22 Mo) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-04534868 , version 1 (05-04-2024)

Licence

Paternité

Identifiants

  • HAL Id : hal-04534868 , version 1

Citer

Ségolène Martin, Yunshi Huang, Fereshteh Shakeri, Jean-Christophe Pesquet, Ismail Ben Ayed. Transductive Zero-Shot and Few-Shot CLIP. CVPR 2024 - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2024, Seattle, Washington, United States. ⟨hal-04534868⟩
2 Consultations
0 Téléchargements

Partager

Gmail Facebook X LinkedIn More