SlytheRSS - an RSS Scraper
Updated Sep 8, 2008 by databeast

Introduction

The goal of SlytheRSS is to act as a semantically aware RSS scraper that scours a number of RSS feeds and extracts only the articles of interest to the user. Articles should be prioritized by their global popularity (much as Google News surfaces the most talked-about stories of the day) and by per-user correlations, using a mix of keyword context and semantic similarity to articles the user has marked as interesting (much as the Digg suggestion engine does).
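To make the two ranking signals concrete, here is a minimal Python sketch. It is not taken from the SlytheRSS code base: the function names (tokenize, cosine, rank_articles), the popularity_weight and threshold parameters, and the scoring formula are all assumptions made for illustration. Plain bag-of-words cosine similarity stands in for real semantic similarity, and "global popularity" is approximated as the share of other scraped articles that cover a similar story.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_articles(articles, liked_texts, popularity_weight=0.5, threshold=0.3):
    """Return (score, article) pairs sorted from most to least relevant.

    articles: title or summary strings gathered from all feeds.
    liked_texts: texts of articles the user marked as interesting.
    Both weights are hypothetical tuning knobs, not project settings.
    """
    vectors = [Counter(tokenize(a)) for a in articles]
    liked = [Counter(tokenize(t)) for t in liked_texts]
    scored = []
    for i, vec in enumerate(vectors):
        # Global popularity: share of the other articles telling a similar story.
        others = [v for j, v in enumerate(vectors) if j != i]
        similar = sum(1 for v in others if cosine(vec, v) >= threshold)
        popularity = similar / len(others) if others else 0.0
        # Per-user relevance: best match against any article the user liked.
        relevance = max((cosine(vec, l) for l in liked), default=0.0)
        score = popularity_weight * popularity + (1 - popularity_weight) * relevance
        scored.append((score, articles[i]))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    feed_items = [
        "Critical OpenSSH vulnerability patched",
        "OpenSSH vulnerability patched in new release",
        "Local bakery wins award",
    ]
    for score, title in rank_articles(feed_items, ["OpenSSH vulnerability advisory"]):
        print(f"{score:.3f}  {title}")
```

A weighted sum is only one way to combine the two signals; a real implementation would likely replace the word-overlap measure with something that captures meaning rather than shared vocabulary.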

As an initial test data set, SlytheRSS is being written around scraping hundreds of information-security feeds and reducing them to only those articles relevant to the security threats faced by the user's particular network installation. Ideally, however, the project should be adaptable to any subject of interest (assuming sufficient public newsfeeds exist on the topic).


Documentation and Planning


Slytherss Mailing List

