Abstract: Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks. In this paper, we study these models for the task of word alignment, and we propose and assess several extensions of a vanilla variational autoencoder. We demonstrate that these techniques can yield competitive results compared to Giza++ and to a strong neural network alignment system for two language pairs.
https://hal.archives-ouvertes.fr/hal-02949042
Contributor: Anh Khoa Ngo Ho
Submitted on: Friday, September 25, 2020 - 12:07:17
Last modified on: Friday, February 26, 2021 - 16:30:50
Long-term archiving on: Thursday, December 3, 2020 - 17:45:16
Anh Khoa Ngo Ho, François Yvon. Generative latent neural models for automatic word alignment. Association for Machine Translation in the Americas, Oct 2020, Miami, Florida, United States. pp.64-77. ⟨hal-02949042⟩