Updated Jan 25, 2009 by matteo.zandi
MeanMachine  
Mean-Machine is a graphical tool for bayes-swarm data visualization

Introduction

Mean Machine is an experimental sub-project, named after the famous Sugar Ray song. It lets you run queries against the database and plot the resulting data.

Details

Before using the graphical frontend, the data has to be indexed with the Xapian library. This can be done with the xapian_index.rb script, found in our spidering library, Pulsar.

Xapian stores not only which words belong to which text but also their positions in the text. This allows pattern queries such as "clinton NEAR health", or even more complex ones.
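
To illustrate why stored positions make a query like "clinton NEAR health" possible, here is a minimal pure-Python sketch of a positional index. It is not Xapian's implementation and does not reproduce Xapian's exact window semantics; the function names, the sample documents and the `window` parameter are assumptions made for this example.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term][doc_id].append(position)
    return index


def near(index, term_a, term_b, window):
    """Return sorted ids of documents where both terms occur at most
    `window` positions apart (hypothetical helper, not a Xapian call)."""
    postings_a = index.get(term_a, {})
    postings_b = index.get(term_b, {})
    hits = []
    # Only documents containing both terms can match.
    for doc_id in postings_a.keys() & postings_b.keys():
        if any(abs(pa - pb) <= window
               for pa in postings_a[doc_id]
               for pb in postings_b[doc_id]):
            hits.append(doc_id)
    return sorted(hits)


# Made-up sample documents, for illustration only.
docs = {
    1: "clinton unveils new health plan",
    2: "clinton visits ohio while senate debates budget and later discusses health",
}
index = build_positional_index(docs)
print(near(index, "clinton", "health", window=3))   # [1]
print(near(index, "clinton", "health", window=10))  # [1, 2]
```

Without the position lists, the index could only say that both documents contain both words; it could not tell that the words are close together in the first one only.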

Visualization

There are currently two components, each producing a different kind of plot.

search

The user enters the terms to search for; the most relevant documents are then presented together with a cloud of the most relevant words. Clicking on a word refines the search towards the documents that match both words (to be precise, the two terms are ORed, so documents that match both are given a higher score).
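
The effect of ORing the terms can be sketched in a few lines of Python. This is a toy scoring function, not Xapian's actual weighting scheme: the score is simply the number of occurrences of the query terms, and the function name and sample documents are assumptions made for this example.

```python
def or_search(docs, terms):
    """Rank documents that match ANY of the terms.

    Toy score: total occurrences of the query terms in the document,
    so a document matching both terms outranks one matching only one.
    """
    scores = {}
    for doc_id, text in docs.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample documents, for illustration only.
docs = {
    "a": "clinton speaks on health",
    "b": "clinton in ohio",
    "c": "weather report",
}
print(or_search(docs, ["clinton", "health"]))  # [('a', 2), ('b', 1)]
```

Document "b" is still returned because the query is an OR, but document "a", which matches both terms, is ranked first.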

graph

The user enters the terms to search for, and the program plots a graph of the most connected words, i.e. the words most likely to be found together with what the user entered. By adjusting the two sliders, it is possible to include or exclude other terms according to their weight in the search or the strength of their connection to the others.

Graphs can be exported in different formats.
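
The two sliders described above can be thought of as two thresholds on a word co-occurrence graph. The sketch below uses plain Python dictionaries rather than igraph and is not Mean Machine's actual code: taking a term's weight to be the number of documents it appears in, and an edge's strength to be the number of documents in which both words appear, are assumptions made for this example, as are all the names.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs, min_term_weight=1, min_edge_strength=1):
    """Return {(word_a, word_b): strength} for the edges that pass both thresholds."""
    term_weight = Counter()    # number of documents containing each word
    edge_strength = Counter()  # number of documents containing both words
    for text in docs:
        terms = sorted(set(text.lower().split()))
        term_weight.update(terms)
        # Sorting the terms first gives every pair a canonical order.
        edge_strength.update(combinations(terms, 2))
    kept = {t for t, w in term_weight.items() if w >= min_term_weight}
    return {
        edge: strength
        for edge, strength in edge_strength.items()
        if strength >= min_edge_strength and edge[0] in kept and edge[1] in kept
    }


# Made-up sample documents, for illustration only.
docs = ["clinton health plan", "clinton health reform", "clinton ohio"]
print(len(cooccurrence_graph(docs)))  # 6 edges with both thresholds at 1
print(cooccurrence_graph(docs, min_term_weight=2, min_edge_strength=2))
# {('clinton', 'health'): 2}
```

Raising either threshold prunes the graph: the first drops rarely seen words, the second drops weak connections between the words that remain.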

What do I need to try it?

Ubuntu

Install Xapian, its Python bindings, gtkhtml2 and igraph (enable the Ubuntu repository listed on the igraph homepage first):

sudo apt-get install python-xapian python-gtkhtml2 python-igraph

Get the source code

cd ~
svn checkout http://bayes-swarm.googlecode.com/svn/trunk/mean-machine mean-machine

Download and uncompress the sample db

cd ~
mkdir xapian-pagestore
cd xapian-pagestore
wget http://www.battlehorse.net/swarm_sample/2008_01xap.tar.bz2
tar xvf 2008_01xap.tar.bz2

Finally run it!

cd ~/mean-machine
python mean-machine.py

Windows

Install Python, PyGTK (and everything it needs to run), gtkhtml2 and igraph.

Download and uncompress the sample db, then run it!

