Issue 78: UTF-16 in memory strings in results
Status:  Complete
Owner: ----
Closed:  Sep 2010
Reported by martijn....@gmail.com, Nov 12, 2009
What steps will reproduce the problem?
1 Set up ODBC correctly.
2 Execute the following:
 import pyodbc
 database = pyodbc.connect('DSN=localhost')
 cursor = database.cursor()
 cursor.execute('SHOW CREATE TABLE mysql.users')
 print cursor.fetchall()
 cursor.close()
 database.close()
3 Notice that the output is in UTF-16 in-memory format, containing something
like this:
 [(u'\U00730075\U00720065', u'\U00520043\U00410045\U00450054\U00540020 ...
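
One way to read the sample above, offered as a guess rather than a confirmed diagnosis: each oversized code point looks like two 16-bit UTF-16 code units packed into a single 32-bit character. The Python sketch below repacks the code points copied from that output as little-endian 32-bit integers and decodes the resulting bytes as UTF-16. The helper name `unpack_utf16_pairs` is made up for illustration, and little-endian byte order is an assumption.

```python
import struct


def unpack_utf16_pairs(code_points):
    """Reinterpret 32-bit values as pairs of little-endian UTF-16 code units."""
    # Pack every value as an unsigned 32-bit little-endian integer ...
    raw = struct.pack("<%dI" % len(code_points), *code_points)
    # ... then read the same bytes back as 16-bit UTF-16 code units.
    return raw.decode("utf-16-le")


# Code points copied from the garbled output in the report.
first_column = [0x00730075, 0x00720065]
second_column = [0x00520043, 0x00410045, 0x00450054, 0x00540020]

print(unpack_utf16_pairs(first_column))   # user
print(unpack_utf16_pairs(second_column))  # CREATE T
```

If this reading is right, the data itself is intact and only the assumed character width differs between the driver and the code that reads the buffer.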

What is the expected output? What do you see instead?
Something readable in UTF-8, containing something like this:
...
`Host` char(60) COLLATE utf8_bin NOT NULL DEFAULT '',
`User` char(16) COLLATE utf8_bin NOT NULL DEFAULT '',
...

What version of the product are you using? On what operating system?
pyodbc-2.1.5-2.fc11.i586
unixODBC-2.2.14-2.fc11.i586
mysql-5.1.37-1.fc11.i586
mysql-libs-5.1.37-1.fc11.i586
mysql-server-5.1.37-1.fc11.i586
mysql-connector-odbc-5.1.5r1144-4.fc11.i586
Dec 31, 2009
Project Member #1 mkleehammer
Try adding CHARSET=UTF8 to your connection string:

  cnxn = pyodbc.connect('DSN=localhost;CHARSET=UTF8')

Status: Investigating
Jun 22, 2010
#3 danerick...@gmail.com
I've had the same issue, which only seems to happen on 64-bit machines.

Here's my setup:
Ubuntu 10.04 64-bit
pyodbc-2.1.7
unixODBC-2.2.11-21
FreeTDS-0.82

Server:
MS SQL Server 2008

I've traced the problem back to a call to SQLGetData in getdata.cpp. The function doesn't seem to handle Unicode strings correctly on 64-bit systems. Specifying a charset in the pyodbc.connect() call doesn't fix it. I was able to fix it by mapping all Unicode character types to SQL_C_CHAR; I'm attaching a diff of the fix.

Daniel Erickson
Concentric Sky, Inc.
www.concentricsky.com
Attachment: wchar-fix.diff (412 bytes)
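
The idea behind the workaround described in this comment (fetching Unicode character columns as SQL_C_CHAR) can be sketched in Python. This is not the attached diff and not pyodbc's actual code: `choose_c_type` and its `force_narrow` flag are hypothetical names used only for illustration. The numeric values are the standard ODBC type codes from the ODBC headers.

```python
# Standard ODBC type codes (values from the ODBC headers).
SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR = 1, 12, -1
SQL_WCHAR, SQL_WVARCHAR, SQL_WLONGVARCHAR = -8, -9, -10
SQL_C_CHAR = SQL_CHAR    # ask the driver for narrow (byte) characters
SQL_C_WCHAR = SQL_WCHAR  # ask the driver for wide characters

NARROW_TYPES = {SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR}
WIDE_TYPES = {SQL_WCHAR, SQL_WVARCHAR, SQL_WLONGVARCHAR}


def choose_c_type(sql_type, force_narrow=False):
    """Pick the C buffer type to request for a character column.

    Hypothetical helper: with force_narrow=True, wide column types are
    also fetched as SQL_C_CHAR, which is the mapping the comment describes.
    """
    if sql_type in NARROW_TYPES:
        return SQL_C_CHAR
    if sql_type in WIDE_TYPES:
        return SQL_C_CHAR if force_narrow else SQL_C_WCHAR
    raise ValueError("not a character type: %r" % (sql_type,))


print(choose_c_type(SQL_WVARCHAR))                     # -8
print(choose_c_type(SQL_WVARCHAR, force_narrow=True))  # 1
```

Requesting narrow characters sidesteps any disagreement about the width of a wide character, at the cost of relying on the driver to encode the text into bytes.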
Sep 6, 2010
Project Member #4 mkleehammer
Fixed in 2.1.8.

Thanks.

Status: Fixed
Nov 21, 2010
Project Member #5 mkleehammer
(No comment was entered for this change.)
Status: Complete
May 7, 2014
#6 kesten.b...@gmail.com
Has there been a regression of this patch?
I have pyodbc 3.0.7 and it appears to not have the patch by danerick applied.

I was affected by the problem described here on a 64-bit Mac (Mavericks) pulling data from Vertica 6.1.2 using unixODBC 2.3.2.

I can confirm that the patch worked nicely.


Jul 28, 2014
#7 todd.ko...@gmail.com
I'm running pyodbc 3.0.7, unixODBC 2.3.0, and Vertica 7.1.0 on a 64-bit Mac with Mavericks. It looks like the patch was reverted; putting it back made everything better.

-Todd
Sep 29, 2014
#9 mverri...@gmail.com
I'm having this same problem using pyodbc 3.0.7, unixODBC 2.3.2, and Vertica 6.1.3 on SunOS 5.10. Querying from isql seems to work fine. Querying the same DSN using pyodbc returns 2-byte Unicode results, just as the OP shows. I did try setting CHARSET, with no luck.

I think maybe this issue should be reopened.
Oct 14, 2014
#10 RyanCaco...@gmail.com
I'm having the same issue with Vertica. Please reopen and reapply the patch; it's the only thing that fixes it.

My issues have been detailed here https://community.vertica.com/vertica/topics/-unrecognized-icu-conversion-error-using-pyodbc?topic-reply-list%5Bsettings%5D%5Bfilter_by%5D=all&topic-reply-list%5Bsettings%5D%5Bpage%5D=1#reply_14887740
Jan 20, 2015
#11 adrian...@gmail.com
Change is on line 322 in 3.0.7.

After many hours of debugging, this was also the only change that fixed my issues.
See this for related info:
https://github.com/lionheart/django-pyodbc/issues/35