Overview
Spinn3r is a web service for indexing the blogosphere. We provide raw access to every post being published in real time. We provide the data and you can focus on building your application.
Spinn3r handles all the difficult tasks of running a spider/crawler including spam prevention, language categorization, ping indexing, and trust ranking.
You can read more about Spinn3r at http://spinn3r.com.
Reference Client
This project implements client bindings to access the Spinn3r web service.
Right now this provides only Java bindings, but we plan to port this implementation to Python, Perl, Ruby, etc. (and we'd love help from the Open Source community).
All of our drivers will be released under the Apache 2.0 license. It is a very liberal license that allows customers and researchers using the Spinn3r API to build whatever type of application they want on top of our platform without having to worry about legal and licensing implications.
This implementation is also very small and clean, so ports to other languages should be straightforward.
Documentation
See the permalink API, our source API, our wire protocol, and the Javadoc.
Basics
Each API represents a top level object collection and supports the following methods:
- getDelta() - Find all new items based on after and/or before parameters, which are ISO 8601 timestamps.
- entry() - Fetch HTML for a unique item based on URL or a unique identifier.
- history() - Fetch the history of items for a given source or feed.
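As a rough illustration of the after and before parameters used by getDelta(), the sketch below builds a time window as ISO 8601 timestamps. The class DeltaWindow and the method deltaParams are hypothetical names invented for this example and are not part of the reference client. The checks that at least one bound is present and that after precedes before are also assumptions, not documented behavior.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Spinn3r reference client. */
final class DeltaWindow {

    /**
     * Builds the time-window parameters for a getDelta() style request.
     * Either bound may be null, mirroring the "after and/or before" wording.
     */
    static Map<String, String> deltaParams(Instant after, Instant before) {
        // Assumption: a request with neither bound is not meaningful.
        if (after == null && before == null) {
            throw new IllegalArgumentException("at least one of after/before is required");
        }
        // Assumption: when both bounds are given, the window must not be empty.
        if (after != null && before != null && !after.isBefore(before)) {
            throw new IllegalArgumentException("after must be earlier than before");
        }
        Map<String, String> params = new LinkedHashMap<>();
        // Instant.toString() produces an ISO 8601 timestamp in UTC.
        if (after != null) {
            params.put("after", after.toString());
        }
        if (before != null) {
            params.put("before", before.toString());
        }
        return params;
    }

    public static void main(String[] args) {
        Instant after = Instant.parse("2008-01-01T00:00:00Z");
        Instant before = Instant.parse("2008-01-01T01:00:00Z");
        // Prints {after=2008-01-01T00:00:00Z, before=2008-01-01T01:00:00Z}
        System.out.println(deltaParams(after, before));
    }
}
```

How these parameters are actually sent to the service is defined by the wire protocol documentation, which this sketch does not attempt to reproduce.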
Permalink API
Spinn3r fetches the full HTML of every post published in the blogosphere.
Supports getDelta(), entry(), and history() methods.
Feed API
An index of all RSS and Atom content included within Spinn3r. The Permalink API also includes this content, so the Feed API is only valuable for customers who specifically want to index RSS and might want slightly faster resolution and somewhat higher performance.
Supports getDelta(), entry(), and history() methods.
Comment API
We are currently beta testing an API for indexing comments published in the blogosphere.
Supports getDelta(), entry(), and history() methods.
Link API
Indexes new links as they are published throughout the blogosphere.
Supports getDelta(), entry(), and history() methods.
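
Since the Permalink, Feed, Comment, and Link APIs all list the same three methods, one way to picture them is as a shared collection interface. The sketch below is only a mental model: the names CollectionSketch, Item, ItemCollection, and InMemoryCollection, the field names, and the method signatures are all hypothetical and do not come from the reference client or its Javadoc. An in-memory stand-in replaces the real web service so the example runs on its own.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the shared method set; not the real client API. */
final class CollectionSketch {

    /** Illustrative item record; field names are assumptions. */
    public static final class Item {
        public final String url;
        public final String source;
        public final Instant published;

        public Item(String url, String source, Instant published) {
            this.url = url;
            this.source = source;
            this.published = published;
        }
    }

    /** The three methods each API is described as supporting. */
    public interface ItemCollection {
        List<Item> getDelta(Instant after, Instant before);
        Item entry(String url);
        List<Item> history(String source);
    }

    /** In-memory stand-in for the web service, for demonstration only. */
    public static final class InMemoryCollection implements ItemCollection {
        private final List<Item> items = new ArrayList<>();

        public void add(Item item) {
            items.add(item);
        }

        @Override
        public List<Item> getDelta(Instant after, Instant before) {
            List<Item> out = new ArrayList<>();
            for (Item item : items) {
                // A null bound means that side of the window is open.
                boolean lateEnough = after == null || item.published.isAfter(after);
                boolean earlyEnough = before == null || item.published.isBefore(before);
                if (lateEnough && earlyEnough) {
                    out.add(item);
                }
            }
            return out;
        }

        @Override
        public Item entry(String url) {
            for (Item item : items) {
                if (item.url.equals(url)) {
                    return item;
                }
            }
            return null; // no item with that URL
        }

        @Override
        public List<Item> history(String source) {
            List<Item> out = new ArrayList<>();
            for (Item item : items) {
                if (item.source.equals(source)) {
                    out.add(item);
                }
            }
            return out;
        }
    }
}
```

With two items published by the same source at 00:10 and 02:00 UTC, a getDelta() window from 00:00 to 01:00 would return only the first, while history() for that source would return both.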
Other Implementations
There is an unofficial native Perl client. The API design differs slightly from the official reference client but is compatible with the Spinn3r API.