Conference paper

Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3)

Abstract: We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures. In particular, we show that we can build variants of our parser with smaller search spaces and time complexities ranging from O(n^6) down to O(n^3). The cubic-time variant covers 98% of constituents observed in linguistic treebanks while having the same complexity as continuous constituency parsers. We evaluate our approach on German and English treebanks (Negra, Tiger, and DPTB) and report state-of-the-art results in the fully supervised setting. We also experiment with pre-trained word embeddings and BERT-based neural networks.
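
To make the term "block degree two" concrete, here is a minimal sketch that counts the contiguous blocks in a constituent's yield, assuming the yield is given as a collection of token positions. The helper names `blocks` and `block_degree` are hypothetical and chosen only for illustration. This is not the parsing algorithm from the paper, only a check of the structural property the abstract refers to.

```python
from typing import Iterable, List, Tuple


def blocks(yield_positions: Iterable[int]) -> List[Tuple[int, int]]:
    """Group token positions into maximal contiguous blocks.

    Each block is returned as an inclusive (start, end) pair.
    """
    positions = sorted(set(yield_positions))
    result: List[Tuple[int, int]] = []
    for p in positions:
        if result and p == result[-1][1] + 1:
            # p extends the current block by one token.
            result[-1] = (result[-1][0], p)
        else:
            # p starts a new block (first position, or a gap precedes it).
            result.append((p, p))
    return result


def block_degree(yield_positions: Iterable[int]) -> int:
    """Number of contiguous blocks in a constituent's yield."""
    return len(blocks(yield_positions))


# A continuous constituent covers a single block of tokens.
continuous = [2, 3, 4]
# A discontinuous constituent with one gap covers two blocks.
discontinuous = [0, 1, 4, 5]
```

Under this definition, a continuous constituent has block degree one, and a constituent with a single gap in its yield has block degree two, which is the class of structures the abstract describes.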

https://hal.archives-ouvertes.fr/hal-03029253
Contributor: Caio Corro
Submitted on: Saturday, 28 November 2020 - 02:32:49
Last modified on: Monday, 22 February 2021 - 16:21:17

File

2020.emnlp-main.219.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: hal-03029253, version 1

Citation

Caio Corro. Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3). Empirical Methods in Natural Language Processing, Nov 2020, Punta Cana (virtual), Dominican Republic. ⟨hal-03029253⟩
