
crawler-commons
Overview
crawler-commons is a set of reusable Java components that implement functionality common to any web crawler. These components benefit from collaboration among various existing web crawler projects and reduce duplication of effort.
Crawler-Commons News
22nd April 2015 - crawler-commons has moved
The crawler-commons project is now hosted on GitHub, following the shutdown of Google Code project hosting. Please go to https://github.com/crawler-commons/crawler-commons for the latest news, issues, documentation, and code.
Project Information
- License: Apache License 2.0
- 71 stars
- svn-based source control