SUPPORTING INFORMATION RETRIEVAL IN RSS FEEDS - CentraleSupélec
Conference paper, Year: 2010

SUPPORTING INFORMATION RETRIEVAL IN RSS FEEDS

Georges Dubus
  • Role: Author
Mathieu Bruyen
  • Role: Author
Nacéra Bennacer Seghouani

Abstract

Really Simple Syndication (RSS) feeds present new challenges to information retrieval technologies. In this paper we propose an RSS feed retrieval approach that aims to give the user a personalized view of feed items and to make their content easier to access. We define several filters to construct the vocabulary used in the text describing feed items. This filtering takes into account both the lexical category and the frequency of terms. The set of feed items is then represented in an m-dimensional vector space. The k-means clustering algorithm, with an adapted centroid computation and distance measure, is applied to find clusters automatically. The clusters, indexed by relevant terms, can then be refined, labeled and browsed by the user. We evaluate the approach on a collection of feed items gathered from news sites. The resulting clusters show good cohesion and separation. This provides meaningful classes for organizing the information and for classifying new feed items.
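
The pipeline outlined in the abstract (vocabulary filtering, vector-space representation, k-means clustering) can be illustrated with a minimal sketch. The abstract does not specify the actual filters, the adapted centroid computation or the distance measure, so everything below is an assumption made for illustration: a small stop-word list stands in for the lexical-category filter, document-frequency thresholds (`min_df`, `max_df_ratio`) stand in for the frequency filter, and the clustering uses cosine distance with length-normalized mean centroids. The sample headlines and all function names are hypothetical.

```python
import math
import random
from collections import Counter

# Assumption: a tiny stop list stands in for the paper's lexical-category filter.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "for", "is", "as", "at"}


def tokenize(text):
    """Lowercase the text and split it on any non-alphabetic character."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def build_vocabulary(items, min_df=2, max_df_ratio=0.8):
    """Keep terms that pass the stop list and the document-frequency bounds."""
    df = Counter()
    for text in items:
        df.update(set(tokenize(text)) - STOP_WORDS)
    limit = max_df_ratio * len(items)
    return sorted(t for t, n in df.items() if min_df <= n <= limit)


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def vectorize(text, vocabulary):
    """Represent an item as a unit-length term-count vector of size m."""
    counts = Counter(tokenize(text))
    return normalize([float(counts[t]) for t in vocabulary])


def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20, seed=0):
    """k-means with cosine distance and normalized mean centroids (an assumption)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each item joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: recompute each non-empty cluster's centroid.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels, centroids


# Hypothetical feed item titles, invented for this example.
items = [
    "Election results: the president wins the vote",
    "President faces vote in parliament election",
    "Parliament vote follows election of president",
    "Football team takes the cup final match",
    "Cup final: football match ends in draw",
    "Team celebrates football cup after match",
]

vocabulary = build_vocabulary(items)
vectors = [vectorize(text, vocabulary) for text in items]
labels, centroids = kmeans(vectors, k=2)
print(vocabulary)
print(labels)
```

In this sketch the terms with the largest weights in each centroid could serve as index terms for a cluster, and a new feed item could be classified by vectorizing it with the same vocabulary and assigning it to the nearest centroid.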

Domains

Web
No file deposited

Dates and versions

hal-00493864, version 1 (21-06-2010)

Identifiers

  • HAL Id: hal-00493864, version 1

Cite

Georges Dubus, Mathieu Bruyen, Nacéra Bennacer Seghouani. SUPPORTING INFORMATION RETRIEVAL IN RSS FEEDS. 6th International Conference on Web Information Systems and Technologies. WEBIST 2010, Apr 2010, Valencia, Spain. pp.307-312. ⟨hal-00493864⟩