Overview

This software framework provides base code to support the design and implementation of recommender engines.

For end-users

Usage examples

A Flickr example is coming soon.

Documentation

API/Javadoc documentation is distributed with the source package and is also available online.

Details

Studying recommender algorithms involves repetitive tasks, such as validating results against multiple subsets of the training data to increase confidence in them, and representing results graphically in several ways as an aid to analysis.
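One common way to validate against multiple subsets of the training data is k-fold cross-validation. The sketch below shows the splitting step only; it names k-fold as one possible technique, and the class FoldSplitter and its methods are hypothetical, not part of the framework API. It is written in a conservative style that avoids post-1.5 syntax.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for illustration; not part of the framework API. */
final class FoldSplitter {
    private FoldSplitter() { }

    /**
     * Shuffles a copy of the data with a fixed seed and deals it
     * round-robin into k folds, so fold sizes differ by at most one.
     */
    static <T> List<List<T>> split(List<T> data, int k, long seed) {
        if (k < 2 || k > data.size()) {
            throw new IllegalArgumentException("k must be in [2, data.size()]");
        }
        List<T> shuffled = new ArrayList<T>(data);
        Collections.shuffle(shuffled, new Random(seed));
        List<List<T>> folds = new ArrayList<List<T>>();
        for (int i = 0; i < k; i++) {
            folds.add(new ArrayList<T>());
        }
        for (int i = 0; i < shuffled.size(); i++) {
            folds.get(i % k).add(shuffled.get(i));
        }
        return folds;
    }

    /** Training set for a given test fold: every fold except that one. */
    static <T> List<T> trainingFor(List<List<T>> folds, int testFold) {
        List<T> training = new ArrayList<T>();
        for (int i = 0; i < folds.size(); i++) {
            if (i != testFold) {
                training.addAll(folds.get(i));
            }
        }
        return training;
    }
}
```

Each fold serves once as the test set while the remaining folds form the training set, so an experiment yields k performance figures that can be averaged.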

Such repetitive tasks call for an infrastructure of tools, reusable code modules, procedures and data standards, which together can be organized as a framework; that framework is described in this section. The following diagram shows the data flow and main processes necessary for such a framework.

The framework can be broken down into these main concepts:

Input: Loads data describing the relationships among entities on a given social network. Sources can be plain text files, YAML, XML or relational databases. An extension of the data input interface can also read relationships expressed in open standards for describing social media; for example, RDF databases can be queried with the SPARQL language to select items.
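As an illustration of the plain text case, the following sketch reads one relationship per line. The line format ("user item rating", separated by whitespace, with "#" comments) and the names Rating and TextRatingLoader are assumptions made for this example; the framework's actual input interface is documented in the Javadoc.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value type: one user-item relationship with a rating. */
final class Rating {
    final String user;
    final String item;
    final double value;

    Rating(String user, String item, double value) {
        this.user = user;
        this.item = item;
        this.value = value;
    }
}

/** Hypothetical loader; the "user item rating" line format is assumed. */
final class TextRatingLoader {
    private TextRatingLoader() { }

    static List<Rating> load(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        List<Rating> ratings = new ArrayList<Rating>();
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            line = line.trim();
            if (line.length() == 0 || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] fields = line.split("\\s+");
            if (fields.length != 3) {
                throw new IOException("line " + lineNumber + ": expected 3 fields");
            }
            try {
                ratings.add(new Rating(fields[0], fields[1],
                        Double.parseDouble(fields[2])));
            } catch (NumberFormatException e) {
                throw new IOException("line " + lineNumber + ": bad rating");
            }
        }
        return ratings;
    }
}
```

Taking a Reader rather than a file name keeps the sketch independent of where the text comes from (a file, a network stream, or a string in a unit test).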

Filter: Applies a filter to the input data in order to make projections (reduce data dimensionality), summarize, fill in missing data with averages, add noise, normalize ratings, perform data type transformations, or convert unique domain-specific IDs into the sequential integers that some recommender engines take as input. Several filters can be composed to achieve more complex data transformations.

Write data to a file: Persists filtered input data or experiment results to disk for later processing, graphical representation etc.
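The filter composition described above can be sketched as a chain of small transformations. All names here (Filter, CompositeFilter, FillMissingWithMean, MinMaxNormalize, IdIndexer) are hypothetical, and representing ratings as a double[] with NaN for missing values is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical filter contract: transforms a data set into a new one. */
interface Filter<T> {
    T apply(T input);
}

/** Applies several filters in order, feeding each output to the next. */
final class CompositeFilter<T> implements Filter<T> {
    private final List<Filter<T>> steps = new ArrayList<Filter<T>>();

    CompositeFilter<T> then(Filter<T> step) {
        steps.add(step);
        return this;
    }

    public T apply(T input) {
        T current = input;
        for (Filter<T> step : steps) {
            current = step.apply(current);
        }
        return current;
    }
}

/** Replaces missing ratings (NaN) with the average of the known ones. */
final class FillMissingWithMean implements Filter<double[]> {
    public double[] apply(double[] input) {
        double sum = 0.0;
        int known = 0;
        for (double v : input) {
            if (!Double.isNaN(v)) {
                sum += v;
                known++;
            }
        }
        double mean = known == 0 ? 0.0 : sum / known;
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = Double.isNaN(input[i]) ? mean : input[i];
        }
        return out;
    }
}

/** Rescales ratings linearly to [0, 1]; constant input maps to zeros. */
final class MinMaxNormalize implements Filter<double[]> {
    public double[] apply(double[] input) {
        double[] out = new double[input.length];
        if (input.length == 0) {
            return out;
        }
        double min = input[0];
        double max = input[0];
        for (double v : input) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        for (int i = 0; i < input.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (input[i] - min) / range;
        }
        return out;
    }
}

/** Maps domain-specific IDs to sequential integers in first-seen order. */
final class IdIndexer {
    private final Map<String, Integer> index = new LinkedHashMap<String, Integer>();

    int indexOf(String id) {
        Integer existing = index.get(id);
        if (existing == null) {
            existing = Integer.valueOf(index.size());
            index.put(id, existing);
        }
        return existing.intValue();
    }
}
```

Filling missing values before normalizing matters in this sketch: the order of the then(...) calls is the order in which the filters run.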

Save and read training: Persists a learned recommender model to disk and reads it back. This process offers few possibilities for extension because serialization needs vary widely from one recommender algorithm to another.
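For models that are plain Serializable objects, standard Java object serialization is one possible persistence mechanism. The sketch below assumes that case; ModelStore is a hypothetical name, and algorithms with other serialization needs would require their own format.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical helper; assumes the learned model is Serializable. */
final class ModelStore {
    private ModelStore() { }

    /** Writes the model to the target file, replacing any previous content. */
    static void save(Serializable model, File target) throws IOException {
        ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(target));
        try {
            out.writeObject(model);
        } finally {
            out.close();
        }
    }

    /** Reads back a model written by save; the caller casts the result. */
    static Object load(File source) throws IOException, ClassNotFoundException {
        ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(source));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }
}
```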

Recommend: Returns recommended items based on a previously learned recommendation model.
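As a minimal sketch of this step, assume the learned model has already been reduced to a score per item for one user. TopNRecommender is a hypothetical name, and ranking by a precomputed score map is an assumption; real recommender algorithms differ in how scores are produced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: picks the n best-scored items not yet seen. */
final class TopNRecommender {
    private TopNRecommender() { }

    static List<String> recommend(final Map<String, Double> scores,
                                  Set<String> alreadySeen, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        List<String> candidates = new ArrayList<String>();
        for (String item : scores.keySet()) {
            if (!alreadySeen.contains(item)) {
                candidates.add(item);
            }
        }
        Collections.sort(candidates, new Comparator<String>() {
            public int compare(String a, String b) {
                // Higher score first; ties are broken alphabetically.
                int byScore = Double.compare(scores.get(b), scores.get(a));
                return byScore != 0 ? byScore : a.compareTo(b);
            }
        });
        int end = Math.min(n, candidates.size());
        return new ArrayList<String>(candidates.subList(0, end));
    }
}
```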

Measure performance: Based on the input data and its attributes (total volume, training ratio etc.) and the recommendations made, this process calculates several performance measures, such as recall, precision and their derivatives (like F-measure), mean absolute error and mean squared error.

Train recommender: Establishes a base interface for training and for managing the progress of the training process.

Plot results: Represents performance results and experiment parameters graphically. Fed with experiment performance data and input metadata (training ratio, total volume, specific parameters etc.), this process plots simple graphs, histograms, or scatter plots for multivariate analysis.
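The performance measures named above have standard definitions, sketched below. The class name Metrics and its method signatures are hypothetical; the conventions for empty inputs (returning 0) are assumptions chosen for this example.

```java
import java.util.Set;

/** Hypothetical helper implementing the standard metric definitions. */
final class Metrics {
    private Metrics() { }

    /** Number of recommended items that are also relevant. */
    private static int hits(Set<String> recommended, Set<String> relevant) {
        int count = 0;
        for (String item : recommended) {
            if (relevant.contains(item)) {
                count++;
            }
        }
        return count;
    }

    /** Fraction of recommended items that are relevant. */
    static double precision(Set<String> recommended, Set<String> relevant) {
        if (recommended.isEmpty()) {
            return 0.0;
        }
        return (double) hits(recommended, relevant) / recommended.size();
    }

    /** Fraction of relevant items that were recommended. */
    static double recall(Set<String> recommended, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) hits(recommended, relevant) / relevant.size();
    }

    /** Balanced F-measure (F1): harmonic mean of precision and recall. */
    static double fMeasure(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    /** Mean of |predicted - actual| over all rating pairs. */
    static double meanAbsoluteError(double[] predicted, double[] actual) {
        checkSameLength(predicted, actual);
        double total = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            total += Math.abs(predicted[i] - actual[i]);
        }
        return total / predicted.length;
    }

    /** Mean of (predicted - actual)^2 over all rating pairs. */
    static double meanSquaredError(double[] predicted, double[] actual) {
        checkSameLength(predicted, actual);
        double total = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            total += diff * diff;
        }
        return total / predicted.length;
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException(
                    "arrays must be non-empty and of equal length");
        }
    }
}
```

Precision and recall apply to sets of recommended items, while the two error measures apply to predicted rating values, so an experiment typically reports one group or the other depending on what the recommender outputs.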

For framework developers

You need JDK 1.5.

Download and set up Maven.

Check out the source code (RecFwk module) from the Subversion repository.

To compile

mvn compile

To unit test

mvn test

To run Javadoc and Checkstyle and to generate distribution/release files

Download and build UmlGraph.jar, then install it into your local Maven repository:

mvn install:install-file -DgroupId=umlgraph -DartifactId=UMLGraph -Dversion=5.2 -Dpackaging=jar -Dfile=UmlGraph.jar

and then to generate the release packages:

mvn package

To install into your local Maven repository for use by other Maven projects:

mvn install

Every once in a while you may need to clean temp and target files:

mvn clean

Misc

To generate LaTeX Javadoc, install the TexDoclet:

mvn install:install-file -Dfile=etc/texdoclet.jar -DgroupId=org.wonderly.doclets -DartifactId=texdoclet -Dversion=1.2 -Dpackaging=jar

then run

mvn site

The generated LaTeX report should be at target/site/tex/docs.tex
