The Role of Information Complexity and Randomization in Representation Learning

Matias Vera; Pablo Piantanida; Leonardo Rey Vega

Preprint, Working Paper. Year: 2018

Matias Vera

Role: Author

Consejo Nacional de Investigaciones Científicas y Técnicas [Buenos Aires]

Pablo Piantanida

Role: Author
PersonId : 736967
IdHAL : pablo-piantanida
ORCID : 0000-0002-8717-2117

Laboratoire des signaux et systèmes

Leonardo Rey Vega

Role: Author

Consejo Nacional de Investigaciones Científicas y Técnicas [Buenos Aires]

Abstract

A grand challenge in representation learning is to learn the different explanatory factors of variation behind high-dimensional data. Encoder models are often determined by optimizing performance on training data, when the real objective is to generalize well to unseen data. Although there is ample numerical evidence suggesting that noise injection (during training) at the representation level might improve the generalization ability of encoders, an information-theoretic understanding of this principle remains elusive. This paper presents a sample-dependent bound on the generalization gap of the cross-entropy loss that scales with the information complexity (IC) of the representations, that is, the mutual information between inputs and their representations. The IC is empirically investigated for standard multi-layer neural networks trained with SGD on the MNIST and CIFAR-10 datasets; the behaviour of the gap and that of the IC appear to be directly correlated, suggesting that SGD selects encoders that implicitly minimize the IC. We specialize the IC to study the role of Dropout in the generalization capacity of deep encoders, which is shown to be directly related to the encoder capacity, a measure of the distinguishability among samples from their representations. Our results support some recent regularization methods.
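As a toy illustration of the quantity the abstract calls information complexity, the sketch below computes the mutual information I(X;U) between a discrete input X and its representation U, and shows that injecting noise at the representation level lowers it. This is not the paper's estimator, bound, or experimental setup: the function names, the four-symbol alphabet, and the uniform symbol-replacement noise (a simple stand-in, not Dropout as analyzed in the paper) are all assumptions made for the example.

```python
import numpy as np


def mutual_information(p_x, p_u_given_x):
    """Return I(X;U) in nats for a discrete input pmf and a stochastic encoder.

    p_x[x] is the input pmf; p_u_given_x[x, u] is the encoder's conditional pmf.
    """
    p_x = np.asarray(p_x, dtype=float)
    p_u_given_x = np.asarray(p_u_given_x, dtype=float)
    joint = p_x[:, None] * p_u_given_x       # p(x, u)
    p_u = joint.sum(axis=0)                  # marginal p(u)
    product = p_x[:, None] * p_u[None, :]    # p(x) p(u)
    mask = joint > 0                         # skip zero-probability pairs
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


def noisy_encoder(p_u_given_x, eps):
    """Hypothetical noise model: with probability eps, replace the
    representation by a symbol drawn uniformly at random."""
    k = p_u_given_x.shape[1]
    return (1.0 - eps) * p_u_given_x + eps / k


# Assumed toy setup: four equiprobable inputs, identity (deterministic) encoder.
p_x = np.full(4, 0.25)
deterministic = np.eye(4)

ic_clean = mutual_information(p_x, deterministic)
ic_noisy = mutual_information(p_x, noisy_encoder(deterministic, eps=0.3))

print(f"I(X;U) without noise: {ic_clean:.4f} nats")
print(f"I(X;U) with noise:    {ic_noisy:.4f} nats")
```

With a deterministic one-to-one encoder, I(X;U) equals the input entropy, log 4 nats here; any added noise can only reduce it, which is the sense in which randomization limits how distinguishable samples are from their representations.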

Subject areas

Information Theory [cs.IT]; Networks and Telecommunications [cs.NI]; Information Theory and Coding [math.IT]; Statistics [math.ST]

Contributor: Pablo Piantanida

https://centralesupelec.hal.science/hal-01742442

Submitted on: Saturday, March 24, 2018, 23:42:59

Last modified on: Monday, March 18, 2024, 03:05:32

Dates and versions

hal-01742442, version 1 (24-03-2018)

Identifiers

HAL Id: hal-01742442, version 1
arXiv: 1802.05355

Cite

Matias Vera, Pablo Piantanida, Leonardo Rey Vega. The Role of Information Complexity and Randomization in Representation Learning. 2018. ⟨hal-01742442⟩

Collections

CNRS SUP_LSS SUP_TELECOMS CENTRALESUPELEC UNIV-PARIS-SACLAY GS-ENGINEERING GS-COMPUTER-SCIENCE
