- Download the latest Wikipedia dump from http://download.wikimedia.org/backup-index.html (you only need the "Articles, templates, image descriptions, and primary meta-pages" dump, i.e. "pages-articles.xml.bz2")
- Get the source of the Wikipedia app from http://collison.ie/wikipedia-iphone/ (the version I used was 0.1) and put it in a clean folder on a Linux box
- Extract the source package with the code
bzcat -d wikipedia-iphone-0.1.tar.bz2 > wikipedia-iphone-0.1.tar
- Untar
tar -xvf wikipedia-iphone-0.1.tar
- Modify indexer.c so it can be compiled under gcc 4.x for Linux (comment out line 30)
Line 30: // NEXT_NODE(node, cmp) = ++storepos;
- Compile
cd c
./bootstrap.sh
./configure
make
- Get locate, locate.code, locate.bigram from here (for i386 architecture; on other machines, find the appropriate findutils package and extract the binaries) and put them into some directory on the Linux box
- Get mklocatedb.sh here and change $LIBEXECDIR on Line 35 to where you have put the findutils binaries
Line 35: ${LIBEXECDIR:=____your_directory_here____}; export LIBEXECDIR
- Install Ruby (on CentOS use the following commands as root)
yum install -y ruby
yum install -y ruby-devel ruby-docs ruby-ri ruby-irb ruby-rdoc
- Install Rubygems - download and untar the tgz package from http://rubyforge.org/frs/?group_id=126&release_id=17305 and install (as root) (for non-root installations see http://www.rubygems.org/read/chapter/3#page83)
ruby setup.rb
- Install RubyInline
gem install RubyInline
- Modify sh/process to point to mklocatedb.sh
Line 29: cat $ifile | LC_ALL=C ____your_path_to_mklocatedb.sh____/mklocatedb.sh > $sfile
- Chmod the files you downloaded
chmod 755 mklocatedb.sh
chmod 755 locate
chmod 755 locate.code
chmod 755 locate.bigram
- Run sh/process
cd sh
./process ____path_to_datafile_source____
- Should any step in sh/process fail, the following files will still be generated. Run ls -lrt to see whether any of them has a size of 0 bytes, delete those, troubleshoot the failed step, and rerun
*pages-articles.xml.bz2.processed
*pages-articles.xml.bz2.index.txt
*pages-articles.xml.bz2.locate.db
*pages-articles.xml.bz2.blocks.db
*pages-articles.xml.bz2.locate.prefixdb
- When it's all done, rename the files so they become the following (the index.txt can be discarded):
processed
locate.db
blocks.db
locate.prefixdb
- Transfer the 4 files to /var/root/wp, start the application and enjoy!
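
The cleanup and rename steps above can be scripted. What follows is only a rough sketch, not part of the wikipedia-iphone package: the function names remove_empty_outputs and rename_outputs are made up for illustration, and it assumes that all generated files for a single dump sit in one directory.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with wikipedia-iphone).
# Assumption: the directory holds the output files of exactly one dump.

# Delete zero-byte output files left behind by a failed step.
remove_empty_outputs() {
    dir=$1
    for f in "$dir"/*pages-articles.xml.bz2.*; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        if [ ! -s "$f" ]; then           # -s is true only for non-empty files
            echo "removing empty file: $f"
            rm -f "$f"
        fi
    done
}

# Rename the four generated files to the short names the app expects.
rename_outputs() {
    dir=$1
    for suffix in processed locate.db blocks.db locate.prefixdb; do
        for f in "$dir"/*pages-articles.xml.bz2."$suffix"; do
            [ -f "$f" ] || continue
            mv "$f" "$dir/$suffix"
        done
    done
}
```

After sourcing the file, run remove_empty_outputs with the directory holding the generated files when a step has failed, and rename_outputs with the same directory once everything has completed. The index.txt file is left untouched, so it can be deleted by hand.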
Doesn't work for the Russian dump of the wiki :( Probably the wrong character encoding.
How do I get this to work on 2.0?