Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification
Conference paper, Year: 2018

Abstract

In contrast to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model, in which each document is represented by a graph that encodes relationships between its terms. Under this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs using node centrality criteria. We also introduce novel graph-based weighting schemes that enrich the graphs with word-embedding similarities in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria.
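
As a rough illustration of the Graph-of-Words idea described in the abstract, the following minimal Python sketch builds a co-occurrence graph for a single document and weights each term by a node centrality score. It is not the authors' implementation: the function names, the sliding-window size of 3, the choice of normalised degree centrality and the toy sentence are all assumptions made for this example, and the collection graph, label graph and word-embedding edge weights are not covered.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph: terms are nodes, and an edge
    links two distinct terms that appear within `window` consecutive tokens.
    (Hypothetical helper; the window size is an assumption.)"""
    adjacency = defaultdict(set)
    for i, term in enumerate(tokens):
        adjacency[term]  # register the node even if it ends up isolated
        for other in tokens[i + 1 : i + window]:
            if other != term:
                adjacency[term].add(other)
                adjacency[other].add(term)
    return adjacency


def degree_centrality_weights(adjacency):
    """Weight each term by its normalised degree, deg(v) / (n - 1).
    (Degree centrality is one possible criterion, assumed here.)"""
    n = len(adjacency)
    if n <= 1:
        return {term: 0.0 for term in adjacency}
    return {term: len(neigh) / (n - 1) for term, neigh in adjacency.items()}


# Toy document, invented for illustration only.
tokens = "graph based term weighting for text classification with graph features".split()
weights = degree_centrality_weights(graph_of_words(tokens, window=3))
```

In this toy document the term "graph" occurs in two different contexts, so it is connected to more distinct neighbours than any other term and receives the highest weight. A raw frequency count would also rank it first here; the graph-based weight differs in that it rewards the variety of contexts a term appears in rather than the number of occurrences.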
Main file
fusing-document-collection.pdf (244.67 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01848880, version 1 (25-07-2018)

Cite

Konstantinos Skianis, Fragkiskos Malliaros, Michalis Vazirgiannis. Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification. NAACL-HLT Workshop on Graph-Based Natural Language Processing (TextGraphs), Jun 2018, New Orleans, Louisiana, United States. ⟨10.18653/v1/w18-1707⟩. ⟨hal-01848880⟩