airhead-research - issue #94

LSA currently defaults to writing the term-document matrix in Matlab format even when SVDLIBJ is used


Posted on Apr 8, 2011 by Happy Rhino

What steps will reproduce the problem?

1. Run LSA on a system without svdlibc

What is the expected output? What do you see instead?

Without svdlibc installed, Matrices.getMatrixBuilderForSVD() returns a MatlabSparseMatrixBuilder, which writes a format that is inefficient if the process is going to use SVDLIBJ. Nothing is technically wrong with the code, but the Matlab format is slower to write and forces an unnecessary format conversion in SVD.java. Perhaps we should revisit the getMatrixBuilder() call for the case where svdlibc isn't installed. (The main difficulty is the case where the file actually does need to be in Matlab format, since we would need some way to select the right builder.)
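One possible shape for that selection, as a rough sketch only: pick the output format from the SVD back end that will actually run, and let the caller say explicitly when a Matlab file is required. Everything here (SvdFormatSelection, Backend, Format, chooseBackend, formatFor) is a hypothetical name for illustration and not part of the existing code, and the assumption that svdlibc is preferred when installed, with SVDLIBJ as the fallback, is mine.

```java
// Hypothetical sketch: none of these types exist in the current code base.
final class SvdFormatSelection {

    /** SVD back ends discussed in this issue, plus Matlab itself. */
    enum Backend { SVDLIBC, SVDLIBJ, MATLAB }

    /** Placeholder on-disk formats; the names are illustrative only. */
    enum Format { SVDLIBC_INPUT, SVDLIBJ_INPUT, MATLAB_SPARSE }

    private SvdFormatSelection() { }

    /**
     * Picks the back end that will actually run.
     * Assumption: svdlibc is preferred when installed, SVDLIBJ is the
     * fallback, and an explicit request for a Matlab file overrides both.
     */
    static Backend chooseBackend(boolean svdlibcInstalled,
                                 boolean needMatlabFile) {
        if (needMatlabFile) {
            return Backend.MATLAB;
        }
        return svdlibcInstalled ? Backend.SVDLIBC : Backend.SVDLIBJ;
    }

    /**
     * Maps a back end to the format it is assumed to read without a
     * conversion step, so the builder never writes a file that must be
     * converted before the SVD runs.
     */
    static Format formatFor(Backend backend) {
        switch (backend) {
            case SVDLIBC: return Format.SVDLIBC_INPUT;
            case SVDLIBJ: return Format.SVDLIBJ_INPUT;
            case MATLAB:  return Format.MATLAB_SPARSE;
            default:
                throw new IllegalArgumentException(
                    "Unknown back end: " + backend);
        }
    }
}
```

With this shape, a system without svdlibc would get SVDLIBJ_INPUT from formatFor(chooseBackend(false, false)) rather than the Matlab format, and the Matlab builder would only be chosen when needMatlabFile is set.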

This issue is just so we can officially track it.

Comment #1

Posted on Sep 20, 2011 by Quick Horse

I made a branch on GitHub to try to refactor the SVD code into interfaces and implementing classes, mostly so that I could easily swap in NMF implementations in place of SVD. Looking at my refactoring, it looks like this issue will be fixed whenever we get to that code review.

Status: Started

Labels:
Type-Enhancement Priority-Low