MeanMachine
Mean-Machine is a graphical tool for bayes-swarm data visualization
Introduction

Mean Machine is an experimental sub-project, named after the famous Sugar Ray song. It lets you run queries on the database and plot the resulting data.

Details

Before you can use the graphical frontend, the data has to be indexed with the Xapian library. This can be done with the xapian_index.rb script, found in our spidering library, Pulsar. Xapian stores not only which words belong to which text but also their positions in the text, which allows patterns like "clinton NEAR health" or even more complex ones.

Visualization

There are currently two components, each producing a different kind of plot.

search

The user enters the terms to search for; the most relevant documents are then presented together with a cloud of the most relevant words. Clicking on a word in the cloud deepens the search and finds the documents that match both words (to be precise, the two terms are ORed, so documents that match both are given a higher score).

graph

The user enters the terms to search for, and the program plots a graph of the most connected words, i.e. the words most likely to be found together with what the user entered. By adjusting the two sliders, it is possible to include or exclude other terms according to their weight in the search or the strength with which they are connected to the others. Graphs can be exported in different formats.

What do I need to try it?
Ubuntu

Install xapian, its python bindings, gtkhtml2 and igraph (enable the Ubuntu repository from the igraph homepage):

    sudo apt-get install python-xapian python-gtkhtml2 python-igraph

Get the source code:

    cd ~
    svn checkout http://bayes-swarm.googlecode.com/svn/trunk/mean-machine mean-machine

Download and uncompress the sample db:

    cd ~
    mkdir xapian-pagestore
    cd xapian-pagestore
    wget http://www.battlehorse.net/swarm_sample/2008_01xap.tar.bz2
    tar xvf 2008_01xap.tar.bz2

Finally run it!

    cd ~/mean-machine
    python mean-machine.py

Windows

Install python, pygtk (and everything it needs to run), gtkhtml2 and igraph. Then download and uncompress the sample db, and run it!
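
Appendix: how positional search works, in miniature

The Details and search sections mention two ideas that are easier to see in code: storing word positions so that "clinton NEAR health" can be answered, and ORing terms so that documents matching both score higher. The sketch below is a toy, pure-Python illustration of both. It is not Xapian's implementation and does not use the Xapian API. The function names (build_index, near, or_rank), the sample documents, the window of 10 words and the "count the matched terms" score are all assumptions made for this example; Xapian's real weighting is more elaborate.

```python
from collections import defaultdict

def build_index(docs):
    """Build a toy positional index: term -> {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def near(index, a, b, window=10):
    """Return ids of documents where a and b occur within `window` words."""
    postings_a = index.get(a, {})
    postings_b = index.get(b, {})
    hits = set()
    # Only documents containing both terms can match.
    for doc_id in postings_a.keys() & postings_b.keys():
        if any(abs(pa - pb) <= window
               for pa in postings_a[doc_id]
               for pb in postings_b[doc_id]):
            hits.add(doc_id)
    return hits

def or_rank(index, terms):
    """OR the terms: any match counts, more matched terms rank higher."""
    scores = defaultdict(int)
    for term in terms:
        for doc_id in index.get(term, {}):
            scores[doc_id] += 1  # hypothetical score: number of matched terms
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample documents (no punctuation, to keep tokenizing trivial).
docs = {
    1: "clinton unveiled a new health plan today",
    2: ("clinton spoke about trade and taxes for an hour before moving "
        "on to many other topics and finally health"),
    3: "the health budget was debated",
}

index = build_index(docs)
print(near(index, "clinton", "health"))       # only document 1 is close enough
print(or_rank(index, ["clinton", "health"]))  # documents 1 and 2 outrank 3
```

In document 2 the two words are 18 positions apart, so it is excluded by the default window but still ranks alongside document 1 in the ORed search, since both terms are present.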