|
|
Important update: The svd++ model and the combined model (svd++ plus the latent factor global neighborhood model, lfGNbr) had a serious bug in the way sumQE was updated. It has now been fixed, so please update your code with the "svn update" command. The change is: "sumQE[K_NUM+1] = eui * q[itemI][k];" was changed to "sumQE[k] += eui * q[itemI][k];"

Why start this project?
I ran into many difficulties when implementing the classic recommender system algorithms:
(1) For large-scale data (the Netflix dataset, 100M ratings), using CPU and memory carelessly does not work. Because of the amount of data, algorithms and data structures must be designed compactly to keep time and space consumption acceptable.
(2) Data initialization and parameter settings have a great influence on the results. To reproduce Koren's results, my first svd model implementation took about 2 weeks, including 4 days of parameter tuning.
(3) Some other difficulties, ......

To lower the barrier to entering the field of recommender systems, I present some details of the algorithms, as well as of Koren's papers, in the form of code, so that newcomers can get started as soon as possible and people already working on recommender systems have a good reference. I hope this makes a modest contribution to the development of the field, and that more and more people will study this interesting and useful area!

Code Description:
(2) The code is released under the GPL v3 license. Please retain the copyright information when you use the code.
(3) Currently finished algorithms: baseline predictor, knn, svd, svd++, asymmetric-svd, global neighborhood based model (gNbr), and the combination of svd++ and gNbr.
(4) code style
(5) The code surely contains many imperfections and mistakes. If you find a bug or mistake, please email me. I also hope you will join me in improving this project.
(6) All the code has been tested under Debian 6.0 and RHEL AS4 (gcc 3.4.6).

Some useful links:
(2) How to get the training set and test set of the Netflix dataset: netflix dataset preprocessing
(3) the steps for using the knn model in this project
(4) the problems I encountered in the implementation
(5) the results of the knn algorithm
(6) the results of the svd algorithm
(7) the datasets available now
(8) Recommender Systems Handbook download
(9) some papers related to this project

I hope more people will join this project and contribute more code, for example implementations of the RBM model and the temporal model, which are currently missing. Anyone who wants to join can contact me here, or directly via email: honglianglv at gmail

ps: This is my first open source project. After benefiting from open source software for so many years, this is my first step in giving back, and I hope to contribute more in the future. |
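To illustrate the sumQE fix described in the important update, here is a minimal sketch. It assumes that sumQE is meant to hold, for each latent factor k, the running sum of eui * q[itemI][k] over the items one user has rated, which is what the corrected statement suggests. The function name accumulateSumQE, its parameters, and the use of std::vector are hypothetical and are not taken from the project source; only sumQE, eui, q, itemI, and k come from the quoted statement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the project source).
// For one user, returns sumQE where sumQE[k] is the sum over the user's
// rated items of eui * q[itemI][k].
//   q          : item-factor matrix, indexed as q[item][k]
//   items      : ids of the items this user rated
//   errors     : prediction error eui for each entry of `items`
//   numFactors : number of latent factors
std::vector<double> accumulateSumQE(
    const std::vector<std::vector<double> >& q,
    const std::vector<std::size_t>& items,
    const std::vector<double>& errors,
    std::size_t numFactors) {
    // Each rated item needs exactly one error value.
    assert(items.size() == errors.size());

    std::vector<double> sumQE(numFactors, 0.0);
    for (std::size_t n = 0; n < items.size(); ++n) {
        const std::size_t itemI = items[n];
        const double eui = errors[n];
        for (std::size_t k = 0; k < numFactors; ++k) {
            // Corrected form: index by k and accumulate with +=.
            // The buggy form assigned to the single slot K_NUM+1 with =,
            // so no per-factor sum was ever built up.
            sumQE[k] += eui * q[itemI][k];
        }
    }
    return sumQE;
}
```

For example, with q = {{1, 2}, {3, 4}}, items = {0, 1} and errors = {0.5, -1.0}, the function returns {-2.5, -3.0}: every factor receives a contribution from every rated item.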