My favorites | Sign in
Project Home Downloads Wiki Issues Source
New issue   Search
for
  Advanced search   Search tips   Subscriptions
Issue 56: fetched 'remarks' data contains what looks like a buffer-overrun error
2 people starred this issue and may be notified of changes. Back to list
Status:  Accepted
Owner:  ----


Sign in to add a comment
 
Reported by rogermilne@gmail.com, Jun 11, 2009
What steps will reproduce the problem?
-----------------------------------------------
1. Using Access 2007 (or possibly other databases), create a new database 
(use the default *.accdb format), and save it as C:\MyDB.accdb

2. Create a new, empty table called "Table1" in that database.

3. Create a new field named "MyField" in that table.  In the 'Description' 
column of that field, put "THIS IS THE DESCRIPTION OF MY FIELD"

4. Launch python, and try the following code:
>>> import pyodbc
>>> con=pyodbc.connect('DRIVER={Microsoft Access Driver (*.mdb, 
*.accdb)};DBQ=C:\\MyDB.accdb')
>>> cursor = con.cursor()
>>> result = cursor.columns( table='Table1' ).fetchall()
>>> for row in result: print row
...or try this...
>>> for row in result: print row.remarks


What is the expected output? What do you see instead?
-----------------------------------------------
I expect to see the unicode string that I typed into Access, terminated 
correctly.  Instead, I see the following:

u'THIS IS THE DESCRIPTION OF MY FIELD\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\uffff\uffff\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\u0698\xb3\u1930\xb3'

Sometimes this string contains illegal number sequences that cause errors 
such as...
UnicodeEncodeError: 'charmap' codec can't encode characters in position 49-
50: character maps to <undefined>
Sometimes this string contains (near the end) other strings within it, of 
other parts of the database, such as other field names or values.

I also tried connecting to the database using a simple DSN, and the same 
error occurred.

What version of the product are you using? On what operating system?
-----------------------------------------------
(Windows XP, Python 2.5, pyODBC v2.1.6, Microsoft Access 2007)


Please provide any additional information below.
-----------------------------------------------
Please contact rmilne@hellbentgames.com for additional information or help 
in any way, in solving this problem.  If you can point me in the right 
direction, I would most certainly be able to help resolve the problem for 
you or with you (but I've never contributed to a project like this before).

NOTE: I attached a zip of the MyDB.accdb file in hopes that you can access 
it even without Microsoft Access installed.

MyDB.zip
10.9 KB   Download
Aug 7, 2009
Project Member #1 mkleehammer
Good catch.  I've reproduced it, but I'm not sure what to do about it so far.

The Unicode Access unit tests work great.  ODBC returns the correct length; for a 
254 character field, the length is 508 bytes.

When retrieving the remarks field, the value "this is a description" is returned as 
84 bytes, which would mean 42 Unicode characters which is exactly twice as wide as 
the actual value.

It appears to be this Access driver bug: http://support.microsoft.com/kb/q236871

Unfortunately it says both "BUFFER_LENGTH and CHAR_OCTET_LENGTH are 30 when they 
should also be 15" and "This behavior is by design".  I'm not sure how something can 
both be wrong and by design, but it appears it is.

It looks like it always has a terminating NULL, so that's a potential way to fix 
it.  I'm not sure I'm ready to always try to trim strings since it would (1) affect 
performance and (2) break code that expects embedded NULLs.  Perhaps I should set a 
flag when Cursor.columns is used?

In the meantime, you can use something like the following to work around it:

  cleaned = row and row.remarks.split(chr(0), 1)[0] or None

Status: Accepted
Aug 13, 2009
#2 rogermilne@gmail.com
Wow, that's interesting...  Leave it to Microsoft to make a "By Design" bug...   
ugh.  

Thanks for the work-around idea, it's much better than mine (I, of course, thought 
about the problem way too close to the issue, and made a for-loop that pulled 
characters one at a time,rebuilding a new string from it, up to the NULL).  DOH.

I also noticed a second issue on the same field:  I can enter text as long as I want 
into that field in Access, but if I exceed 128 characters, the string is returned 
as "None" or an empty string (can't remember which), in Python.  I don't think that 
one's nearly as work-around'able at this point :-(  I've had to just keep my text 
short and keep people aware of the issue when they add text as well.

Thanks for investigating the bug, I really appreciate it!!

    Roger

Sign in to add a comment

Powered by Google Project Hosting