Learning Disentangled Textual Representations via Statistical Measures of Similarity

Pierre Colombo; Guillaume Staerman; Nathan Noiry; Pablo Piantanida

doi:10.18653/v1/2022.acl-long.187

Conference paper, year : 2022

Pierre Colombo

Function : Corresponding author
PersonId : 743646
IdHAL : pierre-colombo

Laboratoire des signaux et systèmes

Guillaume Staerman

Function : Author
PersonId : 743604
IdHAL : guillaume-staerman

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Nathan Noiry

Function : Author
PersonId : 1110337

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Pablo Piantanida

Function : Author
PersonId : 736967
IdHAL : pablo-piantanida
ORCID : 0000-0002-8717-2117

International Laboratory on Learning Systems

Abstract

When working with textual data, a natural application of disentangled representations is fair classification, where the goal is to make predictions without being biased (or influenced) by sensitive attributes that may be present in the data (e.g., age, gender or race). Dominant approaches to disentangling a sensitive attribute from textual representations rely on simultaneously learning a penalization term that involves either an adversarial loss (e.g., a discriminator) or an information measure (e.g., mutual information). However, these methods require training a deep neural network with several parameter updates for each update of the representation model. The resulting nested optimization loop is time-consuming, adds complexity to the optimization dynamics, and requires careful hyperparameter selection (e.g., learning rates, architecture). In this work, we introduce a family of regularizers for learning disentangled representations that do not require training. These regularizers are based on statistical measures of similarity between the conditional probability distributions with respect to the sensitive attributes. Our novel regularizers do not require additional training, are faster and do not involve additional tuning, while achieving better results both when combined with pretrained and randomly initialized text encoders.
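To make the idea of a training-free regularizer concrete, here is a minimal sketch. The abstract does not say which statistical measures are used, so the choice of the maximum mean discrepancy (MMD) with a Gaussian kernel is an assumption made purely for illustration. The function names, the bandwidth value, and the synthetic data are likewise hypothetical. The penalty compares the representations conditioned on each value of a binary sensitive attribute and has no parameters of its own to train.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


def disentanglement_penalty(z, s, bandwidth=4.0):
    """Compare representations z conditioned on a binary attribute s.

    Hypothetical helper: a small value means the two conditional samples
    look alike, i.e. z carries little information about s.
    """
    return mmd2(z[s == 0], z[s == 1], bandwidth)


# Synthetic stand-ins for encoder outputs (assumed data, not from the paper).
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=200)                 # binary sensitive attribute
z_clean = rng.normal(size=(200, 8))              # independent of s
z_leaky = z_clean + 2.0 * s[:, None]             # shifted when s == 1

clean_penalty = disentanglement_penalty(z_clean, s)
leaky_penalty = disentanglement_penalty(z_leaky, s)
print(f"penalty, attribute-independent representation: {clean_penalty:.4f}")
print(f"penalty, attribute-leaking representation:     {leaky_penalty:.4f}")
```

In a training loop, a term of this kind would typically be added to the task loss with a weighting coefficient and computed in an automatic-differentiation framework. Since it is a closed-form function of the current batch, there is no inner loop of discriminator or estimator updates.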

Domains

Engineering Sciences [physics]

Main file

Learning.pdf (884.69 KB)

Origin : Publisher files allowed on an open archive
licence : CC BY SA - Attribution - ShareAlike

Contributor : Imène DENINE

https://hal.science/hal-04540314

Submitted on : Thursday, April 11, 2024, 9:31:52 AM

Last modification on : Friday, May 17, 2024, 3:08:03 PM

Dates and versions

hal-04540314 , version 1 (11-04-2024)

Identifiers

HAL Id : hal-04540314 , version 1
DOI : 10.18653/v1/2022.acl-long.187

Cite

Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida. Learning Disentangled Textual Representations via Statistical Measures of Similarity. 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland. pp. 2614-2630, ⟨10.18653/v1/2022.acl-long.187⟩. ⟨hal-04540314⟩

Collections

INSTITUT-TELECOM CNRS SUP_LSS PARISTECH SUP_TELECOMS CENTRALESUPELEC UNIV-PARIS-SACLAY LTCI IDS S2A IP_PARIS GS-COMPUTER-SCIENCE GS-SPORT-HUMAN-MOVEMENT HUB-IA ILLS
