Social Bookmarks Friend Finder (sbff) is a GUI application that finds users whose bookmarks are similar to those of a user you specify.
It is similar to delicious-mates (http://www.aiplayground.org/artikel/delicious-mates/), but:
- it is coded in Java (instead of Python)
- it does not require a login, because it does not use the public delicious API (it scrapes web pages, since there is no unauthenticated API)
- it uses a SQL database to hold the large amount of data needed (delicious-mates keeps all the data in memory, which is too much for many users)
- it can be shut down and restarted at any time to continue gathering data
- it uses a Swing GUI
Requirements
- Disk space: for a delicious user with 700 links you will need 300 MiB of storage for the default H2 database.
- Java
Running
Go to a terminal and type:
java -jar sbff-1.2-jar-with-dependencies.jar delicioususer
To open a terminal in Microsoft Windows, go to Start -> Run..., type "cmd" and click OK, then type "cd <folder_with_downloaded_jar>".
delicioususer is the user you want to find friends for.
Using another database (optional)
You may use a different SQL database instead of the included H2 Database. MySQL 5.1 is known to work with sbff.
To do so, create a database (and possibly a user with full access to that database) and set up a jdbc.properties like this:
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/sbff
jdbc.username=myuser
jdbc.password=mypassword
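Before starting sbff against another database, it can help to check that the properties file parses and contains all four keys. The sketch below is only an illustration, not the way sbff itself reads the file: the class name JdbcPropertiesCheck and the method missingKeys are made up for this example, and it assumes one key per line, as the Java properties format expects.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class JdbcPropertiesCheck {

    // The four keys used in the jdbc.properties example in this README.
    static final String[] REQUIRED_KEYS = {
        "jdbc.driverClassName", "jdbc.url", "jdbc.username", "jdbc.password"
    };

    /** Returns the required keys that are absent or blank (hypothetical helper). */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<String>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Inline copy of the example settings; a real check would read the file instead.
        String example =
            "jdbc.driverClassName=com.mysql.jdbc.Driver\n"
            + "jdbc.url=jdbc:mysql://localhost:3306/sbff\n"
            + "jdbc.username=myuser\n"
            + "jdbc.password=mypassword\n";

        Properties props = new Properties();
        props.load(new StringReader(example));

        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```

To check a real file, replace the StringReader with a java.io.FileReader pointing at your jdbc.properties.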
Compiling
Install Maven 2 and type: mvn package
Algorithm
This is the algorithm followed:
- Get the user page to find out how many links the user has. (fastest step: pages_to_get = 1)
- If the number of links stored in the DB differs from that link count, retrieve all the links and store them in the DB. (pages_to_get = number_of_links/100)
- For every link, download the list of users that have that link. (pages_to_get = number_of_links*40)
- Get the link count for the users with the most matching URLs. (pages_to_get = 200)
- Build a user list ordered by the ratio of matching URLs to the number of links that user has.
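The steps above can be illustrated with a small sketch. It is not taken from the sbff sources: the class FriendRankingSketch, the Candidate holder and both method names are hypothetical, and rounding number_of_links/100 up to whole pages is an assumption. It adds up the pages_to_get figures from the list and orders candidates by the ratio of matching URLs to total links, highest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FriendRankingSketch {

    /** Hypothetical holder for one candidate user. */
    static final class Candidate {
        final String name;
        final int matchingUrls; // URLs shared with the target user
        final int totalLinks;   // total number of links the candidate has

        Candidate(String name, int matchingUrls, int totalLinks) {
            this.name = name;
            this.matchingUrls = matchingUrls;
            this.totalLinks = totalLinks;
        }

        /** Matching URLs divided by the candidate's link count (0 if they have no links). */
        double ratio() {
            return totalLinks == 0 ? 0.0 : (double) matchingUrls / totalLinks;
        }
    }

    /** Sums the pages_to_get figures listed for each step of the algorithm. */
    static long estimatePagesToGet(int numberOfLinks) {
        long userPage = 1;
        long linkPages = (numberOfLinks + 99) / 100;       // assumption: rounded up
        long usersPerLinkPages = (long) numberOfLinks * 40;
        long linkCountPages = 200;
        return userPage + linkPages + usersPerLinkPages + linkCountPages;
    }

    /** Returns a copy of the candidates ordered by descending ratio. */
    static List<Candidate> rankByRatio(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<Candidate>(candidates);
        sorted.sort((a, b) -> Double.compare(b.ratio(), a.ratio()));
        return sorted;
    }

    public static void main(String[] args) {
        // 1 + 7 + 28000 + 200 pages for a user with 700 links
        System.out.println("Pages to get: " + estimatePagesToGet(700)); // prints 28208

        List<Candidate> ranked = rankByRatio(Arrays.asList(
            new Candidate("alice", 30, 600),   // ratio 0.05
            new Candidate("bob", 12, 40),      // ratio 0.30
            new Candidate("carol", 25, 100))); // ratio 0.25
        for (Candidate c : ranked) {
            System.out.println(c.name + " " + c.ratio());
        }
    }
}
```

Dividing by the candidate's own link count keeps users with very large collections from dominating the list just because they happen to share a few URLs. The estimate also shows that the per-link step accounts for almost all of the downloads.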
Databases known not to work with sbff
These embedded databases are known not to work with sbff because they are buggy:
These are the bugs that sbff triggers: