Information Bottleneck and Representation Learning

Pablo Piantanida; Leonardo Rey Vega

doi:10.1017/9781108616799.012

Book chapter, Year: 2021

Pablo Piantanida

Role: Author
PersonId : 736967
IdHAL : pablo-piantanida
ORCID : 0000-0002-8717-2117

Laboratoire des signaux et systèmes

Leonardo Rey Vega

Role: Author

Consejo Nacional de Investigaciones Científicas y Técnicas [Buenos Aires]

Abstract

A grand challenge in representation learning is the development of computational algorithms that learn the different explanatory factors of variation behind high-dimensional data. Representation models (usually referred to as encoders) are often chosen by optimizing performance on training data, whereas the real objective is to generalize well to other (unseen) data. The first part of this chapter provides an overview of, and an introduction to, fundamental concepts in statistical learning theory and the Information Bottleneck principle. It serves as the mathematical basis for the technical results of the second part, which gives an upper bound on the generalization gap corresponding to the cross-entropy risk. When this bound, used as a penalty term and scaled by a suitable multiplier, is minimized jointly with the cross-entropy empirical risk, the problem is equivalent to optimizing the Information Bottleneck objective with respect to the empirical data distribution. This result provides an interesting connection between mutual information and generalization, and helps to explain why noise injection during the training phase can improve the generalization ability of encoder models and enforce invariances in the resulting representations.
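As an illustration of the kind of objective the abstract refers to, the following minimal sketch evaluates a common form of the Information Bottleneck Lagrangian, H(Y|U) + lam * I(X;U), for discrete variables. It assumes the Markov chain U - X - Y and a stochastic encoder given as a row-stochastic matrix. The function name `ib_objective`, the multiplier name `lam`, and the toy distributions are hypothetical choices made for this example; the chapter's own notation and exact penalty term may differ.

```python
import numpy as np

def ib_objective(p_xy, q_u_given_x, lam):
    """Return (H(Y|U) + lam * I(X;U), H(Y|U), I(X;U)) in nats.

    p_xy        : joint distribution of (X, Y), shape (nx, ny)
    q_u_given_x : stochastic encoder, rows sum to 1, shape (nx, nu)
    lam         : hypothetical multiplier weighting the compression term
    """
    p_xy = np.asarray(p_xy, dtype=float)
    q = np.asarray(q_u_given_x, dtype=float)

    p_x = p_xy.sum(axis=1)        # marginal of X
    p_xu = p_x[:, None] * q       # joint of (X, U)
    p_u = p_xu.sum(axis=0)        # marginal of U
    p_uy = q.T @ p_xy             # joint of (U, Y), using U - X - Y

    def sum_p_log_ratio(joint, ref):
        # Sum of joint * log(joint / ref) over entries with joint > 0.
        mask = joint > 0
        return float(np.sum(joint[mask] * np.log(joint[mask] / ref[mask])))

    # I(X;U) = sum p(x,u) log [ p(x,u) / (p(x) p(u)) ]
    i_xu = sum_p_log_ratio(p_xu, np.outer(p_x, p_u))
    # H(Y|U) = - sum p(u,y) log [ p(u,y) / p(u) ]
    p_u_ref = np.broadcast_to(p_u[:, None], p_uy.shape)
    h_y_given_u = -sum_p_log_ratio(p_uy, p_u_ref)

    return h_y_given_u + lam * i_xu, h_y_given_u, i_xu

# Toy example: Y is a copy of a fair binary X.
p_xy = np.array([[0.5, 0.0],
                 [0.0, 0.5]])
identity_encoder = np.eye(2)              # U = X, keeps everything
constant_encoder = np.array([[1.0, 0.0],
                             [1.0, 0.0]])  # U ignores X entirely

print(ib_objective(p_xy, identity_encoder, lam=0.5))
print(ib_objective(p_xy, constant_encoder, lam=0.5))
```

In this toy case the identity encoder pays only the compression cost lam * I(X;U), while the constant encoder pays the full prediction cost H(Y|U), so the trade-off is controlled entirely by `lam`.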

Domains

Information theory [cs.IT]; Networks and telecommunications [cs.NI]; Information theory and coding [math.IT]; Statistics [math.ST]

Main file

book.pdf (1.21 MB)

Origin: Files produced by the author(s)

https://centralesupelec.hal.science/hal-01742456

Submitted on: Thursday, June 22, 2023, 02:07:25

Last modified on: Monday, March 18, 2024, 03:17:52

Dates and versions

hal-01742456 , version 1 (19-01-2022)

hal-01742456 , version 2 (22-06-2023)

Identifiers

HAL Id : hal-01742456 , version 2
DOI : 10.1017/9781108616799.012

Cite

Pablo Piantanida, Leonardo Rey Vega. Information Bottleneck and Representation Learning. Cambridge University Press. Information-Theoretic Methods in Data Science, pp.330-358, 2021, ⟨10.1017/9781108616799.012⟩. ⟨hal-01742456v2⟩

Collections

CNRS SUP_LSS SUP_TELECOMS CENTRALESUPELEC UNIV-PARIS-SACLAY GS-COMPUTER-SCIENCE GS-SPORT-HUMAN-MOVEMENT HUB-IA

195 views

155 downloads
