Earlier this year

  • Nov 23, 2009
    ToDo (Things left to be done.) Wiki page commented on by yolabingo   -   Here's a tarball of the code I'm using. README file has some notes. Hope it's useful for you. http://www.lincolnlattonsoftware.com/safebrowsing.tar.gz
  • Nov 22, 2009
    ToDo (Things left to be done.) Wiki page commented on by thejaswi...@gmail.com   -   @yolabingo: Yes, I would love to hear from you on the changes. I actually use Postgres and it returns results within 3 seconds. There is already a patch that uses a key-value store (memcached), which gives blazing fast results. I would love to benchmark both and decide for myself.
  • Nov 20, 2009
    ToDo (Things left to be done.) Wiki page commented on by yolabingo   -   Thanks for sharing this helpful code. I use it daily to check about 10000 urls. On pretty new hardware, this script would take about 10 minutes to execute. I made some ugly hacks to copy the url_hashes_table to an in-memory sqlite DB before querying, and running time is down to about 15 seconds. I'd be happy to share the changes I've made - let me know.
  • Oct 02, 2009
    r41 (Small changes to the django forms to remove hardcoding) committed by thejaswi...@gmail.com   -   Small changes to the django forms to remove hardcoding
  • Aug 15, 2009
    issue 10 (Simple patch to support key/value store (memcached)) reported by adulau   -   As I needed to look up a lot of URLs in safebrowsing, I wanted to have a fast in-memory lookup. So here is a quick-and-dirty patch to add a key/value store (memcached) to safebrowsing-python. Two important notes:
    - I didn't change prepare_db.py. Currently the key/value store is updated using a script like this:
        import memcache
        mc = memcache.Client(['127.0.0.1:11211'], debug=0)
        malware = open("goog-malware.txt")
        i=0
        for line in malware:
            if i>0:
                key = line[1:-1]
                key.rstrip().rstrip()
                if key[:-1] is not None:
                    mc.set(key[:-1], "M")
            i=i+1
      But if you want to incorporate it in prepare_db in a clean way, feel free.
    - The behaviour of the lookup is a bit different: when the URL does not match the database, it returns False. (Makes more sense to me, but this is a matter of taste ;-)
    The first advantage of the key/value store is speed. A small example with 3 URLs (2 not matching and 1 matching):
                sqlite3      memcached
        real    0m6.432s     0m0.054s
        user    0m5.610s     0m0.040s
        sys     0m0.720s     0m0.020s
    The second is that when running a lot of processes in parallel to do the lookup, you can continue to update the key/value store without real impact on the lookup cost. If you have other ideas or comments on the patch, don't hesitate. Thanks a lot.
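In the loader script quoted in the report, the rstrip() result is discarded and the "is not None" check is always true. A tidier variant of the same loop might look like the sketch below. It is not part of the attached patch. It assumes the first line of the list file is a header and that every other line is a one-character marker followed by the hash. The function name load_hashes is hypothetical, and a plain dict stands in for the memcached client so the example runs without a server.

```python
def load_hashes(lines, set_key, label="M"):
    """Store every hash found in `lines` via set_key(key, label).

    Assumes line 0 is a header and each data line is a one-character
    marker followed by the hash (an assumption about the list format).
    Returns the number of keys stored.
    """
    stored = 0
    for index, line in enumerate(lines):
        if index == 0:
            continue  # skip the header line
        key = line.strip()[1:]  # drop surrounding whitespace and the marker
        if key:
            set_key(key, label)
            stored += 1
    return stored


# A dict stands in for the key/value store in this demo.
store = {}
sample = ["header line\n", "+aaaa\t\n", "+bbbb\n", "\n"]
loaded = load_hashes(sample, store.__setitem__)
```

With python-memcached, the same function could be called as load_hashes(open("goog-malware.txt"), mc.set), where mc is the memcache.Client from the report.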
  • Aug 12, 2009
    r40 (Added a contributors name) committed by thejaswi...@gmail.com   -   Added a contributors name
  • Aug 11, 2009
    issue 9 (function lookup_by_md5 (query_lookup.py) is never called (an...) commented on by adulau   -   Thanks a lot for the quick fix. I'm currently adding functionality to store the hash values in a key/value store. I'm adding it because I have a lot of URLs to check (due to large log files) while using concurrent access to the store. The key/value store is easier to put in memory for doing lookups. I'll create a patch when it's more stable. Thanks for your work. Alexandre Dulaunoy
  • Aug 11, 2009
    issue 9 (function lookup_by_md5 (query_lookup.py) is never called (an...) Status changed by thejaswi...@gmail.com   -   Thanks for spotting the bug. Fixed in r39. I have also added a small comment explaining that the helper is currently unused but may be used in the future. Kindly provide your name to appear in the CONTRIBUTORS.txt.
    Status: Fixed
  • Aug 11, 2009
    r39 (Fixed ticket #9 which had a wrong SQL query) committed by thejaswi...@gmail.com   -   Fixed ticket #9 which had a wrong SQL query
  • Aug 11, 2009
    issue 9 (function lookup_by_md5 (query_lookup.py) is never called (an...) reported by adulau   -   lookup_by_md5 is not called and contains an invalid query:
        cursor.execute("SELECT * FROM url_hashes_table url_hash='%s';"%(self.md5_hash))
    WHERE is missing from the SQL statement. Version: r38 (SVN checkout). Maybe better to remove the lookup_by_md5 function to avoid any confusion while reading the code (especially the first time). Kind regards,
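For reference, a corrected query could look like the sketch below. This is not the r39 change itself, which is not shown here. It adds the missing WHERE and, as a further suggestion, binds the hash as a query parameter instead of formatting it into the SQL string. The standalone function signature and the demo schema (in particular the badware_type column name) are assumptions; only url_hashes_table and url_hash come from the report.

```python
import sqlite3


def lookup_by_md5(connection, md5_hash):
    """Return all rows of url_hashes_table whose url_hash equals md5_hash."""
    cursor = connection.cursor()
    # WHERE is present, and the value is bound as a parameter rather than
    # interpolated into the SQL text.
    cursor.execute(
        "SELECT * FROM url_hashes_table WHERE url_hash = ?;", (md5_hash,)
    )
    rows = cursor.fetchall()
    cursor.close()
    return rows


# Demo schema: column names other than url_hash are assumed.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE url_hashes_table (badware_type TEXT, url_hash TEXT)")
connection.execute("INSERT INTO url_hashes_table VALUES ('M', 'abc123')")
matches = lookup_by_md5(connection, "abc123")
misses = lookup_by_md5(connection, "zzz")
```

The "?" placeholder is the sqlite3 style; psycopg2 and MySQLdb use "%s" as the placeholder, still with the values passed as a separate tuple.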
  • Jul 21, 2009
    issue 7 (Lookup class does not handle path correctly) Status changed by thejaswi...@gmail.com   -   First up, thanks for pointing out the bug. The tests managed to point out the exact location of the error. It has been fixed in r37, r38. Regarding the IP address lookup, I have opened another ticket. The project never claimed to do IP lookup, check the ToDo in the wiki. Since it has been a long-pending item, I have opened a ticket on this one. Please contribute patches.
    Status: Fixed
  • Jul 21, 2009
    issue 8 (Does not support IP lookup) reported by thejaswi...@gmail.com   -   As per the Google safebrowsing documentation, the IP addresses must be canonicalized and hashed. This is not handled by safebrowsing-python. Patches welcome.
  • Jul 21, 2009
    r38 (Merge branch 'master' into svn) committed by thejaswi...@gmail.com   -   Merge branch 'master' into svn
  • Jul 20, 2009
    r37 (Pulled from git-dev branch) committed by thejaswi...@gmail.com   -   Pulled from git-dev branch
  • Jul 01, 2009
    r36 (Doh. Removed Stupid redundancies) committed by thejaswi...@gmail.com   -   Doh. Removed Stupid redundancies
  • Jun 23, 2009
    issue 7 (Lookup class does not handle path correctly) reported by dcg...@gmail.com   -   The current code does not split the path; it uses only the full path to calculate the MD5. If the signature is a.b.c/x/, it will fail to detect http://a.b.c/x/y/z.htm. The attached file is a code snippet to correct this issue. Refer to the Google SB document for details. Additional note: this code does not correctly handle a host that is in IP format. :-)
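The path handling described in this report can be illustrated as follows. This is a simplified sketch, not the attached snippet and not the r37/r38 fix: it hashes the host joined with the full path and with each leading directory prefix, so that a signature such as a.b.c/x/ is among the candidates for http://a.b.c/x/y/z.htm. It deliberately ignores query strings, host suffixes, IP hosts, and URL canonicalization, all of which the Google SB document may treat differently. The function names path_prefixes and candidate_hashes are hypothetical.

```python
from hashlib import md5
from urllib.parse import urlsplit


def path_prefixes(path):
    """Return "/" plus each leading directory prefix, then the full path."""
    parts = [part for part in path.split("/") if part]
    prefixes = ["/"]
    for end in range(1, len(parts)):
        prefixes.append("/" + "/".join(parts[:end]) + "/")
    full_path = path or "/"
    if full_path not in prefixes:
        prefixes.append(full_path)
    return prefixes


def candidate_hashes(url):
    """Return the MD5 hex digests of host + prefix for every path prefix."""
    split = urlsplit(url)
    host = split.hostname or ""
    return [
        md5((host + prefix).encode("utf-8")).hexdigest()
        for prefix in path_prefixes(split.path)
    ]


signature = md5(b"a.b.c/x/").hexdigest()
candidates = candidate_hashes("http://a.b.c/x/y/z.htm")
```

A lookup would then check whether any of the candidates appears in the hash table, instead of checking only the hash of the full path.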
  • Jun 08, 2009
    r35 (Added another contributor to CONTRIBUTORS.txt) committed by thejaswi...@gmail.com   -   Added another contributor to CONTRIBUTORS.txt
  • Jun 07, 2009
    issue 5 (Postgres connect string malformed) Status changed by thejaswi...@gmail.com   -   Thank you for the changes.
    Status: Fixed
  • Jun 07, 2009
    issue 6 (Postgres DB update does not work) Status changed by thejaswi...@gmail.com   -   I do not foresee any problem that could occur from a commit.
    Status: Fixed
  • Jun 07, 2009
    r34 (Fixing issue 6 with postgres not committing into the databas...) committed by thejaswi...@gmail.com   -   Fixing issue 6 with postgres not committing into the database. Thank you Oxcoda once again.
  • Jun 07, 2009
    r33 (Fixed an issue with wrong dsn parameters. Thanks Oxcoda. Clo...) committed by thejaswi...@gmail.com   -   Fixed an issue with wrong dsn parameters. Thanks Oxcoda. Closing Issue 5 with this.
  • Jun 07, 2009
    issue 6 (Postgres DB update does not work) reported by Oxcoda   -   What steps will reproduce the problem?
      1. Use fetch_data in prepare_db with Postgres to populate the database
    What is the expected output?
      Data committed to the DB.
    What do you see instead?
      Nothing in the DB.
    Please provide any additional information below.
      There is no commit occurring on the connection. The diff to fix this is below (note: this has NOT been tested with any other DB and may create issues with them):
        --- prepare_db.py (revision 32)
        +++ prepare_db.py (working copy)
        @@ -27,7 +27,8 @@
                 return URL

             def fetch_data(self):
        -        cursor = self.backend.connection.cursor()
        +        conn = self.backend.connection
        +        cursor = conn.cursor()
                 cursor.execute("select * from %s_version;" %(self.badware_type))
                 row = cursor.fetchall()
                 st = string.Template(self.url)
        @@ -71,4 +72,5 @@
                         "VALUES ('%s','%s');" %(self.badware_code, url_hash[1:].strip()))
                 cursor.close()
        +        conn.commit()
                 return 0
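The effect of the missing commit can be reproduced with any DB-API driver. The sketch below uses sqlite3 so that it runs without a Postgres server; it is an illustration of the pattern in the diff, not the project's code. The function name store_hashes and the column names in the INSERT are assumptions.

```python
import os
import sqlite3
import tempfile


def store_hashes(connection, badware_code, url_hashes):
    """Insert the hashes and commit, mirroring the conn.commit() in the diff."""
    cursor = connection.cursor()
    cursor.executemany(
        "INSERT INTO url_hashes_table (badware_type, url_hash) VALUES (?, ?);",
        [(badware_code, url_hash) for url_hash in url_hashes],
    )
    cursor.close()
    # Without this commit the pending inserts are rolled back when the
    # connection is closed.
    connection.commit()
    return 0


# Write through one connection, then read back through a fresh one.
demo_path = os.path.join(tempfile.mkdtemp(), "commit_demo.db")
writer = sqlite3.connect(demo_path)
writer.execute("CREATE TABLE url_hashes_table (badware_type TEXT, url_hash TEXT)")
store_hashes(writer, "M", ["abc123", "def456"])
writer.close()

reader = sqlite3.connect(demo_path)
stored = reader.execute("SELECT COUNT(*) FROM url_hashes_table").fetchone()[0]
reader.close()
```

Reading back through a second connection is what shows whether the data was really persisted, which is the symptom the report describes as "Nothing in the DB".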
  • Jun 07, 2009
    issue 5 (Postgres connect string malformed) commented on by Oxcoda   -   Here is the diff for the above:
        @@ -56,9 +56,9 @@
                 conn_string = ""
                 if not self.db_name:
                     raise Exception("Database name not specified.")
        -        conn_string += "Dbname=%s" %self.db_name
        +        conn_string += "dbname=%s" %self.db_name
                 if self.db_user:
        -            conn_string += "user=%s %s" %(self.db_user, conn_string)
        +            conn_string += " user=%s" %(self.db_user)
                 if self.db_password:
                     conn_string += " password='%s'" %self.db_password
                 if self.db_host:
  • Jun 07, 2009
    issue 5 (Postgres connect string malformed) reported by Oxcoda   -   What steps will reproduce the problem?
      1. Connecting to a postgresql DB
    What version of the product are you using? On what operating system?
      psycopg (v1, not v2)
      CentOS 5.0
      SVN version as of 2009-06-07
    Please provide any additional information below.
      Correct implementation in safebrowsing/backend.py (at the insertion of self.db_user):
        class PostgresqlDbObj(BaseDbObj):
            def __init__(self):
                try:
                    import psycopg2 as Database
                except ImportError:
                    try:
                        import psycopg as Database
                    except ImportError:
                        raise Exception("Libraries psycopg2/psycopg not found.")
                conn_string = ""
                if not self.db_name:
                    raise Exception("Database name not specified.")
                conn_string += "dbname=%s" %self.db_name
                if self.db_user:
                    conn_string += " user=%s" %(self.db_user)
                if self.db_password:
                    conn_string += " password='%s'" %self.db_password
                if self.db_host:
                    conn_string += " host=%s" %self.db_host
                if self.db_port:
                    conn_string += " port=%s" % self.db_port
                self.connection = Database.connect(conn_string)
  • Mar 31, 2009
    issue 4 (traceback in python 2.4 or earlier: md5 issue) Status changed by thejaswi...@gmail.com   -   Thanks a lot for pointing the bug out. I have fixed it in r31. Your help is very much appreciated and I hope I got your name right in the CONTRIBUTORS.txt.
    Status: Fixed
  • Mar 31, 2009
    r32 (Added yet another contributor) committed by thejaswi...@gmail.com   -   Added yet another contributor
  • Mar 31, 2009
    r31 (Fixed wrong import under python2.4. Thanks Jose Nazario (jos...) committed by thejaswi...@gmail.com   -   Fixed wrong import under python2.4. Thanks Jose Nazario (jose.monkey.org)
  • Mar 31, 2009
    issue 3 (Update module to support Mysql) commented on by thejaswi...@gmail.com   -   This issue is partially fixed. The portion that is not fixed has to do with addition of columns to the tables. @dcguan: Django support is available in the form of safe_url_django.
  • Mar 31, 2009
    issue 4 (traceback in python 2.4 or earlier: md5 issue) reported by jose.monkey.org   -   What steps will reproduce the problem?
      1. try and do a lookup_by_url using python 2.4 or older
      2. you get a traceback:
        File "/home/markybob/svn/safebrowsing-python/query_lookup/query_lookup.py", line 43, in lookup_by_url
          md5_hash_list.append(md5(url_comp).hexdigest())
        TypeError: 'module' object is not callable
    What is the expected output? What do you see instead?
      a lookup
    What version of the product are you using? On what operating system?
      python 2.3 on os x 10.4
    Please provide any additional information below.
      you need to properly import md5 in query_lookup.py:
        try:
            from hashlib import md5
        except ImportError:
            # Python2.4 fallback
            from md5 import md5
      note the "from md5 ..." there: with a plain "import md5", calculating an md5 sum is really md5.md5('payload').hexdigest(). that said, you have mixed references to 'md5' in the code between the module and a string. worth auditing the term 'md5' to make sure it's done right.
  • Mar 30, 2009
    r30 (Thanks to Da-Chang Guan for providing an idea about other da...) committed by thejaswi...@gmail.com   -   Thanks to Da-Chang Guan for providing an idea about other database support
  • Mar 30, 2009
    Usage (Howto use the python library.) Wiki page edited by thejaswi...@gmail.com
  • Mar 30, 2009
    r28 (Removed the directories that weren't removed by git-svn) committed by thejaswi...@gmail.com   -   Removed the directories that weren't removed by git-svn
  • Mar 30, 2009
    r27 (New set of features added to safebrowsing. Notable among the...) committed by thejaswi...@gmail.com   -   New set of features added to safebrowsing. Notable among them is db support
  • Mar 28, 2009
    r26 (Added Max Length as per HTTP 1.1 RFC) committed by thejaswi.puthraya   -   Added Max Length as per HTTP 1.1 RFC
  • Mar 27, 2009
    issue 2 (Increase length handled by SafeURL Field) Status changed by thejaswi.puthraya   -   Made the change in r24. Issue closed.
    Status: Fixed
  • Mar 27, 2009
    r25 (Made the code more readable esp the errors) committed by thejaswi.puthraya   -   Made the code more readable esp the errors
  • Mar 27, 2009
    r24 (Made the code more readable esp the errors) committed by thejaswi.puthraya   -   Made the code more readable esp the errors
  • Mar 27, 2009
    issue 3 (Update module to support Mysql) commented on by dcguan   -   BTW, the Django support is not yet available, I am working on it.
  • Mar 27, 2009
    issue 3 (Update module to support Mysql) reported by dcguan   -   Hi, attached is the package in which I modified the module to support MySQL. Please read readme.txt for details. In the lookup module, I do not close the database after a query. A cleanup function may need to be added to handle this.

Older

  • Dec 05, 2008
    issue 2 (Increase length handled by SafeURL Field) reported by thejaswi.puthraya   -   Summary says it all. Increase the max_length from 200 to 2048 characters as per Http 1.1 RFC.
  • Nov 25, 2008
    issue 1 (Incorrect exception in import attempt - query_lookup.py) Status changed by thejaswi.puthraya   -   Thank you for pointing out the issue. I have fixed the error in r22. Thank you Marcel Chastain.
    Status: Fixed
  • Nov 25, 2008
    r23 (Added CONTRIBUTORS.txt file to list all contributors to the ...) committed by thejaswi.puthraya   -   Added CONTRIBUTORS.txt file to list all contributors to the project
  • Nov 25, 2008
    r22 (Fixing Issue 1 wrt a wrong exception being raised. Thank you...) committed by thejaswi.puthraya   -   Fixing Issue 1 wrt a wrong exception being raised. Thank you Marcel Chastain.
  • Nov 25, 2008
    r21 (Removed the lower-cased license.txt file) committed by thejaswi.puthraya   -   Removed the lower-cased license.txt file
  • Nov 25, 2008
    r20 (Renamed the license file to follow certain conventions) committed by thejaswi.puthraya   -   Renamed the license file to follow certain conventions
  • Nov 25, 2008
    issue 1 (Incorrect exception in import attempt - query_lookup.py) reported by marcel.chastain   -   What steps will reproduce the problem?
      1. download source to a system without sqlite3
      2. attempt to import query_lookup module
      3. observe the error
    What is the expected output? What do you see instead?
      The import should succeed. There is a try/except block that attempts to import sqlite3. If it fails, it should fall back to importing pysqlite2. The 'except IOError' on line 7 is incorrect, and should be 'except ImportError'.
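The fallback described in this report can be written as in the sketch below. It shows the general pattern rather than the exact contents of query_lookup.py after r22; the alias name sqlite is an assumption.

```python
try:
    import sqlite3 as sqlite
except ImportError:
    # A missing module raises ImportError, not IOError, so this is the
    # exception the fallback has to catch.
    from pysqlite2 import dbapi2 as sqlite

# Either module exposes the same DB-API connect() call.
connection = sqlite.connect(":memory:")
answer = connection.execute("SELECT 1").fetchone()[0]
connection.close()
```

With 'except IOError', the ImportError raised on a system without sqlite3 would propagate and the pysqlite2 branch would never run.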
  • Oct 21, 2008
    r19 (Refactored query_lookup.py to make it more readable and use ...) committed by thejaswi.puthraya   -   Refactored query_lookup.py to make it more readable and use smaller try blocks
  • Oct 21, 2008
    r18 (Refactored prepare_db.py to remove lots of redundant code) committed by thejaswi.puthraya   -   Refactored prepare_db.py to remove lots of redundant code
  • Oct 21, 2008
    r17 (Added comment regarding lack of field validation for safe_ur...) committed by thejaswi.puthraya   -   Added comment regarding lack of field validation for safe_url_django
  • Oct 20, 2008
    r16 (Compatible with Django-1.0) committed by thejaswi.puthraya   -   Compatible with Django-1.0
 