|
HowToUse
Explains how to compile and use MrsRF.
MrsRF is a multi-core algorithm that uses MapReduce to calculate large Robinson-Foulds (RF) distance matrices. The input to MrsRF is a Newick tree file containing t trees. The output is a t x t distance matrix.
To compile
From the home directory:
List of Created Executables
In src/:
Usage
To run MrsRF on N nodes and c cores, run the following command in the main directory:
    mpirun -np <N> src/mrsrf <cores> <input file> <number of taxa> <number of trees> <output> <rounds>
Where:
For example, to run mrsrf on 2 nodes and 4 cores, using the input file test.tre, which contains 12 trees with 10 taxa each, run the following:
    mpirun -np 2 src/mrsrf 4 test.tre 10 12 1 1
This should produce the following RF matrix:
    0 2 1 5 5 5 5 5 5 5 1 5
    2 0 1 5 4 5 4 5 4 5 1 5
    1 1 0 5 5 5 5 5 5 5 0 5
    5 5 5 0 1 1 2 1 2 0 5 1
    5 4 5 1 0 2 1 2 1 1 5 2
    5 5 5 1 2 0 1 1 2 1 5 1
    5 4 5 2 1 1 0 2 1 2 5 2
    5 5 5 1 2 1 2 0 1 1 5 0
    5 4 5 2 1 2 1 1 0 2 5 1
    5 5 5 0 1 1 2 1 2 0 5 1
    1 1 0 5 5 5 5 5 5 5 0 5
    5 5 5 1 2 1 2 0 1 1 5 0
In the MrsRF release, you will find a sample tree file called 10taxa-12trees.tre. Running MrsRF on that sample file should produce the RF matrix above.
Notes
If the code does not compile out of the box, it may be necessary to reset the CPU_SET definitions in MapReduceScheduler.c. This is due to architectural differences we have encountered across different machines.
To compare the matrix your program generates with ours, a Perl script called compare_files.pl is provided in the peanuts folder. This script can be used to compare two matrices. Alternatively, you can use:
    diff -iB matrix1 matrix2
where matrix1 and matrix2 are the files containing the two matrices to be compared. |
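As a further alternative, the comparison can be sketched in a few lines of Python. This is only an illustration and not part of the MrsRF release: it assumes each matrix file holds one row per non-blank line with whitespace-separated integer entries, and the function names (parse_matrix, check_rf_matrix, first_difference) are hypothetical. Besides comparing two matrices, it checks two properties any distance matrix should have: a zero diagonal and symmetry.

```python
import sys


def parse_matrix(text):
    """Parse whitespace-separated integers, one matrix row per non-blank line.

    Assumption: this is the layout of the matrix file; adjust if yours differs.
    """
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def check_rf_matrix(matrix):
    """Return a list of problems; an empty list means the matrix looks sane."""
    problems = []
    t = len(matrix)
    for i, row in enumerate(matrix):
        if len(row) != t:
            problems.append(f"row {i} has {len(row)} entries, expected {t}")
    if problems:
        return problems  # not square, so the index-based checks below are unsafe
    for i in range(t):
        if matrix[i][i] != 0:
            problems.append(f"diagonal entry ({i},{i}) is {matrix[i][i]}, expected 0")
        for j in range(i + 1, t):
            if matrix[i][j] != matrix[j][i]:
                problems.append(f"entries ({i},{j}) and ({j},{i}) differ")
    return problems


def first_difference(a, b):
    """Return (row, col) of the first mismatch between a and b, or None."""
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            if x != y:
                return (i, j)
        if len(row_a) != len(row_b):
            return (i, min(len(row_a), len(row_b)))
    if len(a) != len(b):
        return (min(len(a), len(b)), 0)
    return None


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        m1, m2 = parse_matrix(f1.read()), parse_matrix(f2.read())
    for problem in check_rf_matrix(m1):
        print(f"{sys.argv[1]}: {problem}")
    diff = first_difference(m1, m2)
    print("matrices match" if diff is None else f"first difference at {diff}")
```

If the block is saved under a name such as compare_matrices.py (a hypothetical file name), it can be run as: python compare_matrices.py matrix1 matrix2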