vertical search engine for library websites

requires a patched solrpy http://code.google.com/r/briantinglecdliborg-solrpy/source/browse/solr/paginator.py that supports highlighting
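highlighting itself is driven by solr's standard request parameters (`hl`, `hl.fl`, `hl.snippets`). the sketch below builds such a request with only the python standard library, so it does not show the patched solrpy API. the base url, the field names `content` and `title`, and the helper name `build_highlight_query` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, fields, snippets=1):
    """Return a Solr select URL that asks for highlighted snippets.

    base_url and fields are illustrative assumptions; adjust them to
    the actual Solr core and schema.
    """
    params = {
        "q": query,
        "wt": "json",                # ask for a JSON response
        "hl": "true",                # turn highlighting on
        "hl.fl": ",".join(fields),   # fields to highlight
        "hl.snippets": snippets,     # max snippets per field
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = build_highlight_query(
    "http://localhost:8983/solr", "digital archives", ["content", "title"]
)
```

the highlighted fragments come back in a separate `highlighting` section of the response, keyed by document id, which is why a client library needs explicit support to expose them.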

requires a patched nutch 1.3 that supports -numFetchers so it can utilize the max maps capacity

uses lucidworks solr http://www.lucidimagination.com/products/certified/solr

needs J2EE (tomcat) and WSGI (apache2/mod_wsgi) application servers to serve the front end site. todo: investigate migrating to solr's new built-in velocity templates, so it will be a one-server app

steps to run the crawl on cloudera hadoop in ec2 on ebs https://gist.github.com/1126909

remove a site from the index https://gist.github.com/1129533
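the gist is not reproduced here and may take a different approach. as a rough sketch, one common way to drop a site from a solr index is a delete-by-query message posted to the `/update` handler, followed by a commit. the field name `site` and the helper name `delete_site_payload` are assumptions; check the schema for the field that actually holds the hostname.

```python
from xml.sax.saxutils import escape


def delete_site_payload(hostname, field="site"):
    """Build a Solr delete-by-query XML message for one hostname.

    The field name "site" is an assumption about the schema.
    """
    # Quote the hostname so it is matched as a single term.
    query = '%s:"%s"' % (field, hostname)
    # Escape &, < and > so the query is safe inside the XML body.
    return "<delete><query>%s</query></delete>" % escape(query)


payload = delete_site_payload("www.example.org")
```

the resulting string would be sent as the body of an HTTP POST with content type `text/xml`, and the deletion only becomes visible to searches after a commit.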

screencast of hadoop running http://www.screenr.com/Dids
