

heyDr is a multi-threaded vertical crawler/spider framework.

Developers can use it to build a vertical crawler/spider quickly and with little code. heyDr separates the crawling process into three parts: URL crawling, info crawling, and info structuring & converging.
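To make the three parts more concrete, here is a minimal sketch in plain Java. It is not heyDr's actual API: the class ThreeStageSketch, the interfaces UrlCrawler, InfoCrawler and InfoStructurer, and the methods run and demo are all hypothetical names invented for illustration, and the stub stages do no network access.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these types come from heyDr.
public class ThreeStageSketch {

    // Part 1 (URL crawling): turn a seed into the list of page URLs to visit.
    interface UrlCrawler {
        List<String> crawlUrls(String seed);
    }

    // Part 2 (info crawling): fetch the raw information for one URL.
    interface InfoCrawler {
        String crawlInfo(String url);
    }

    // Part 3 (info structuring & converging): turn raw info into a structured record.
    interface InfoStructurer {
        Map<String, String> structure(String url, String raw);
    }

    // Runs parts 2 and 3 for every URL on a fixed-size thread pool and
    // gathers the records in the same order as the URLs from part 1.
    static List<Map<String, String>> run(String seed, UrlCrawler urls, InfoCrawler info,
                                         InfoStructurer structurer, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, String>>> futures = new ArrayList<>();
            for (String url : urls.crawlUrls(seed)) {
                Callable<Map<String, String>> task =
                        () -> structurer.structure(url, info.crawlInfo(url));
                futures.add(pool.submit(task));
            }
            List<Map<String, String>> records = new ArrayList<>();
            for (Future<Map<String, String>> f : futures) {
                records.add(f.get()); // waits for each task to finish
            }
            return records;
        } finally {
            pool.shutdown();
        }
    }

    // Wires the pipeline with stub stages so it runs without a network.
    static List<Map<String, String>> demo() throws Exception {
        return run("seed",
                seed -> Arrays.asList(seed + "/a", seed + "/b"),
                url -> "raw:" + url,
                (url, raw) -> {
                    Map<String, String> record = new LinkedHashMap<>();
                    record.put("url", url);
                    record.put("raw", raw);
                    return record;
                },
                2);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Keeping each part behind its own interface means a real crawler could replace one stub (for example, the info-crawling step) without touching the other two.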

As of v1.1, it can also be deployed in a distributed environment.

Current version: 1.1

You can get the source code with svn from the URL below:


heyDr-master-demo (data center) is a server demo that controls all the data in a distributed environment; it communicates via Spring RMI.

heyDr-slaver-demo is a simple demo of a crawler job.

Project Information

The project was created on Jan 4, 2013.

Spider Crawler SearchEngine