Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification

Abstract: In contrast to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model, in which each document is represented by a graph that encodes relationships between its terms. Under this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes that enrich the graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria.
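As a rough illustration of the Graph-of-Words idea described in the abstract, the sketch below builds a co-occurrence graph for a single document and scores each term by degree centrality. It is a minimal sketch under stated assumptions, not the authors' implementation: the sliding-window size, the undirected unweighted edges, the choice of degree centrality, and the function names `graph_of_words` and `degree_centrality` are all illustrative. The collection and label graphs and the word-embedding enrichment are not shown.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph for one document.

    Terms are nodes; an edge links two distinct terms that occur within
    `window` consecutive tokens. The window size is an assumed parameter.
    """
    adjacency = defaultdict(set)
    for i, term in enumerate(tokens):
        adjacency[term]  # register the term even if it ends up isolated
        for other in tokens[i + 1 : i + window]:
            if other != term:
                adjacency[term].add(other)
                adjacency[other].add(term)
    return adjacency


def degree_centrality(adjacency):
    """Weight each term by its degree divided by (number of nodes - 1)."""
    n = len(adjacency)
    if n <= 1:
        return {term: 0.0 for term in adjacency}
    return {term: len(neigh) / (n - 1) for term, neigh in adjacency.items()}


# Hypothetical example document, tokenized by whitespace.
tokens = "graph based term weighting for text classification".split()
weights = degree_centrality(graph_of_words(tokens, window=3))
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.2f}")
```

In such a scheme the centrality score would replace the term-frequency component of a frequency-based weight; other centrality criteria could be swapped in for `degree_centrality` without changing the graph construction.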

https://hal-centralesupelec.archives-ouvertes.fr/hal-01848880
Contributor : Fragkiskos Malliaros
Submitted on : Wednesday, July 25, 2018 - 12:02:11 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:28 PM
Long-term archiving on : Friday, October 26, 2018 - 1:32:55 PM

File

fusing-document-collection.pdf
Files produced by the author(s)

Citation

Konstantinos Skianis, Fragkiskos Malliaros, Michalis Vazirgiannis. Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification. NAACL-HLT Workshop on Graph-Based Natural Language Processing (TextGraphs), Jun 2018, New Orleans, Louisiana, United States. ⟨10.18653/v1/w18-1707⟩. ⟨hal-01848880⟩
