Updated Feb 07, 2008 by sanford.poon
Labels: Phase-Deploy
DatafileGenerationLinux  
  1. Download the latest Wikipedia dump from http://download.wikimedia.org/backup-index.html. You only need the "Articles, templates, image descriptions, and primary meta-pages" dump, i.e. "pages-articles.xml.bz2".
  2. Get the source of the Wikipedia app from http://collison.ie/wikipedia-iphone/ (the version I used was 0.1) and put it in a clean folder on a Linux box
  3. Decompress the source package
  4. bzcat -d wikipedia-iphone-0.1.tar.bz2 > wikipedia-iphone-0.1.tar
  5. Untar
  6. tar -xvf wikipedia-iphone-0.1.tar
  7. Modify indexer.c so that it compiles under gcc 4.x on Linux (comment out line 30)
  8. Line 30:  //  NEXT_NODE(node, cmp) = ++storepos;
  9. Compile
  10.  cd c
     ./bootstrap.sh
     ./configure
     make
  11. Get the locate, locate.code, and locate.bigram binaries (the ones linked from this page are built for the i386 architecture; on other machines, find the appropriate findutils package and extract the binaries from it) and put them in a directory on the Linux box
  12. Get mklocatedb.sh and change $LIBEXECDIR on line 35 to the directory where you put the findutils binaries
  13. Line 35:  : ${LIBEXECDIR:=____your_directory_here____}; export LIBEXECDIR
  14. Install Ruby (on CentOS use the following commands as root)
  15.  yum install -y ruby
     yum install -y ruby-devel ruby-docs ruby-ri ruby-irb ruby-rdoc
  16. Install RubyGems: download the tgz package from http://rubyforge.org/frs/?group_id=126&release_id=17305, untar it, and run the installer as root (for non-root installations, see http://www.rubygems.org/read/chapter/3#page83)
  17. ruby setup.rb
  18. Install RubyInline
  19. gem install RubyInline
  20. Modify sh/process to point to mklocatedb.sh
  21. Line 29:  cat $ifile | LC_ALL=C ____your_path_to_mklocatedb.sh____/mklocatedb.sh > $sfile
  22. Make the files you downloaded executable
  23.  chmod 755 mklocatedb.sh
     chmod 755 locate
     chmod 755 locate.code
     chmod 755 locate.bigram
  24. Run sh/process
  25. cd sh
     ./process ____path_to_datafile_source____
  26. If any step in sh/process fails, the following files are still generated. Run ls -lrt to see whether any of them is 0 bytes in size; if so, delete it, troubleshoot the step that produced it, and rerun
  27.  *pages-articles.xml.bz2.processed
     *pages-articles.xml.bz2.index.txt
     *pages-articles.xml.bz2.locate.db
     *pages-articles.xml.bz2.blocks.db
     *pages-articles.xml.bz2.locate.prefixdb
  28. When it's all done, rename the files as follows (the index.txt file can be discarded)
  29.  processed
     locate.db
     blocks.db
     locate.prefixdb
  30. Transfer the 4 files to /var/root/wp, start the application and enjoy!
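
The indexer.c edit described above (commenting out line 30) can also be scripted. The following is only a sketch: the function name comment_out_line is made up for illustration, and it assumes GNU sed (for the -i option), which is what a typical Linux box provides. It checks that the line really contains the expected text before touching it, so a different source version is left alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of the wikipedia-iphone source):
# comment out line $2 of file $1, but only if that line contains the text $3.
# Assumes GNU sed, whose -i option edits the file in place.
comment_out_line() {
    file=$1
    lineno=$2
    expected=$3
    line=$(sed -n "${lineno}p" "$file")

    case $line in
        "//"*)
            # Already commented out: nothing to do.
            return 0
            ;;
        *"$expected"*)
            # Prefix the line with a C++-style comment marker.
            sed -i "${lineno}s|^|// |" "$file"
            ;;
        *)
            echo "line $lineno of $file does not contain '$expected'; not modified" >&2
            return 1
            ;;
    esac
}
```

For the step above, it would be called from the directory that holds indexer.c as: comment_out_line indexer.c 30 'NEXT_NODE(node, cmp) = ++storepos;'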

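The last steps (checking the generated files for 0-byte output, then renaming them) can be sketched as two small shell functions. This is an illustration, not part of the original package: the function names are made up, the file suffixes are taken from the list above, and it assumes only one dump's output lives in the directory (otherwise the rename would overwrite files).

```shell
#!/bin/sh
# Hypothetical helpers; suffixes are the ones listed in the steps above.
SUFFIXES="processed index.txt locate.db blocks.db locate.prefixdb"

# Print every generated file in directory $1 that exists but is 0 bytes.
list_empty_outputs() {
    dir=$1
    for suffix in $SUFFIXES; do
        for f in "$dir"/*pages-articles.xml.bz2."$suffix"; do
            # -f skips an unmatched glob; ! -s is true for zero-byte files.
            if [ -f "$f" ] && [ ! -s "$f" ]; then
                echo "$f"
            fi
        done
    done
}

# Rename the four files the application needs to their short names.
# index.txt is left alone, since it can be discarded.
rename_outputs() {
    dir=$1
    for suffix in processed locate.db blocks.db locate.prefixdb; do
        for f in "$dir"/*pages-articles.xml.bz2."$suffix"; do
            if [ -f "$f" ]; then
                mv "$f" "$dir/$suffix"
            fi
        done
    done
}
```

Running list_empty_outputs on the data directory should print nothing after a clean run; only then does it make sense to call rename_outputs on the same directory.
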
Comment by Telepenin, May 25, 2008

Doesn't work for the Russian Wikipedia dump :( Probably the wrong character encoding.

Comment by ppstay, Aug 22, 2008

How do I get this to work on 2.0?
