READ-ONLY: This project has been archived. For more information see this post.
Issue 296: Missing rows in select when fetching a long string (XML field)
Status:  New
Owner:  ----


 
Reported by and...@jmunch.dk, Nov 1, 2012
What steps will reproduce the problem?
What is the expected output? What do you see instead?
There's a select from an MS SQL Server database that is returning no rows when I include an XML field. 
For example, if I do:
   curs.execute("select Starttidspunkt from Koersel where SerieNr='12485001'")
then I get 1 row with a timestamp value.
But if I do:
   curs.execute("select CAST(Pdata AS VARBINARY(max)) from Koersel where SerieNr='12485001'")
then I get 0 rows, despite using the exact same where-criteria. The Pdata field is an XML field and the cast turns it into some 30KB of text.
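One way to catch this kind of silent row loss from the calling side is to compare the number of rows actually fetched with a COUNT(*) over the same criteria. The sketch below assumes only a DB-API cursor that accepts `?` placeholders (as pyodbc does); the helper name `fetched_vs_counted` is made up for illustration and is not part of pyodbc.

```python
def fetched_vs_counted(cursor, select_sql, count_sql, params):
    """Return (rows_fetched, rows_counted) for the same WHERE criteria.

    Hypothetical helper: works with any DB-API cursor using '?' placeholders.
    """
    # Ask the server how many rows match.
    cursor.execute(count_sql, params)
    counted = cursor.fetchone()[0]
    # Fetch the rows themselves and count what actually arrives.
    cursor.execute(select_sql, params)
    fetched = len(cursor.fetchall())
    return fetched, counted


# Usage against the table from this report (assumes an open pyodbc cursor `curs`):
# fetched, counted = fetched_vs_counted(
#     curs,
#     "select CAST(Pdata AS VARBINARY(max)) from Koersel where SerieNr=?",
#     "select count(*) from Koersel where SerieNr=?",
#     ("12485001",),
# )
# A result such as (0, 1) would indicate that rows were dropped without an error.
```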

What version of the product are you using? On what operating system?
pyodbc 3.0.6 on Windows XP and Windows 7.

Please provide any additional information below.
Studying the 3.0.6 sources, I think I've found the cause: The "return 0" at the end of the GetDataString function in getdata.cpp.

The issue is with the handling of text of indeterminate length. It looks like GetDataString starts out with a 1024-byte buffer and then loops up to 10 times, adding 2048 bytes each time. The last allocation isn't used, so the effective maximum size is 1024 + 9 * 2048 = 19456 bytes (19 KiB). When the for loop is exhausted, the "return 0" at the end convinces Cursor_fetchlist to break, but since no exception is raised, the result is missing rows instead of an error message.
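The arithmetic behind the 19 KiB limit can be checked with a small model. This is a Python sketch of the growth pattern as described above, not the actual C++ from getdata.cpp; the constant and function names are illustrative.

```python
INITIAL_SIZE = 1024   # bytes, initial buffer as described in the report
INCREMENT = 2048      # bytes added per reallocation
MAX_LOOPS = 10        # loop iterations before falling through to "return 0"


def linear_capacity_steps(initial=INITIAL_SIZE, increment=INCREMENT, loops=MAX_LOOPS):
    """Yield the buffer size available on each loop iteration."""
    size = initial
    for _ in range(loops):
        yield size          # capacity usable during this iteration
        size += increment   # grow for the next one; the final growth is never used


def effective_max(initial=INITIAL_SIZE, increment=INCREMENT, loops=MAX_LOOPS):
    """Largest value that fits before the loop is exhausted."""
    return max(linear_capacity_steps(initial, increment, loops))


def fits(value_len):
    """True if a value of value_len bytes fits within the modeled limit."""
    return value_len <= effective_max()


print(effective_max())   # 19456 bytes, i.e. 19 KiB
print(fits(30 * 1024))   # False: a value of about 30 KB exceeds the limit
```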

I've attached a patch that uses exponential instead of linear growth of the buffer, allowing for strings of up to 2 MiB, and that raises an explanatory MemoryError when even that's not enough.
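The idea of the patch can be sketched in Python as follows. This is only a model of the behaviour described above: the 1024-byte starting size and the 2 MiB cap come from this report, while the function name and control flow are assumptions and may differ from the attached C++ patch.

```python
def buffer_size_for(total_len, initial=1024, limit=2 * 1024 * 1024):
    """Return the buffer size that ends up holding total_len bytes.

    The buffer doubles until it is large enough. Instead of silently giving
    up, a MemoryError is raised once the limit would be exceeded.
    """
    size = initial
    while size < total_len:
        if size * 2 > limit:
            raise MemoryError(
                f"value of {total_len} bytes exceeds the {limit}-byte buffer limit"
            )
        size *= 2
    return size


print(buffer_size_for(30 * 1024))   # 32768: a 30 KB value now fits
```

With doubling, eleven reallocations take the buffer from 1 KiB to 2 MiB, whereas ten linear steps of 2048 bytes only reach 19 KiB.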

It's untested, alas, since I can't build pyodbc (it seems to build only with MSVC++ on Windows). I'd be very grateful for a build with this patch included as soon as possible.

Attachment: pyodbc-select-long-strings.patch (1.5 KB)
Apr 22, 2013
#4 pe...@psantoro.net
I have also hit this issue, and I've attached an updated, untested patch that increases the GetDataString default buffer size to 4 kB and changes its reallocation algorithm. Optionally, the GetDataString default buffer size can be changed via the new pyodbc connect keyword getdata_str_size. The supported range for the getdata_str_size keyword is 1 kB - 128 kB, and the patch enforces this range. If needed, the GetDataString buffer will be reallocated by doubling the previously used buffer size, up to 10 times. With the default, this lets the buffer grow to a maximum of 4 kB * 2^10, or 4 MB. In summary, this patch allows the pyodbc user to control the maximum GetDataString buffer size needed for their applications, within the range of 1 MB (1 kB * 2^10) to 128 MB (128 kB * 2^10).
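The limits quoted above follow directly from doubling a starting size ten times. A quick Python check, with the caveat that the function name is illustrative and that raising ValueError for out-of-range values is an assumption (the comment only says the patch "enforces" the range, not how):

```python
MIN_STR_SIZE_KB = 1      # lower bound for getdata_str_size, per the comment
MAX_STR_SIZE_KB = 128    # upper bound for getdata_str_size, per the comment
MAX_DOUBLINGS = 10       # reallocations allowed, per the comment


def max_buffer_kb(getdata_str_size_kb=4):
    """Maximum buffer size in kB after doubling the starting size 10 times."""
    if not MIN_STR_SIZE_KB <= getdata_str_size_kb <= MAX_STR_SIZE_KB:
        # Assumption: reject the value; the real patch might clamp it instead.
        raise ValueError("getdata_str_size must be between 1 and 128 kB")
    return getdata_str_size_kb * 2 ** MAX_DOUBLINGS


print(max_buffer_kb(4))     # 4096 kB = 4 MB (default)
print(max_buffer_kb(1))     # 1024 kB = 1 MB
print(max_buffer_kb(128))   # 131072 kB = 128 MB
```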

I will endeavor to start testing this patch in the near future.

I've also attached Windows binaries for python 2.7 and python 3.3, if anyone wants to help test this patch.
Attachments:
xmlcol.patch (2.9 KB)
pyodbc-3.0.6.win32-py3.3.exe (226 KB)
pyodbc-3.0.6.win32-py2.7.exe (231 KB)
Apr 24, 2013
#5 pe...@psantoro.net
I've attached an updated, tested patch that adds a few TRACE statements and renames a local variable. I tested this patch on Windows XP SP3 with Python 2.7.4 and Python 3.3.1. Here is an updated summary of this patch:

This patch increases the GetDataString default buffer size/increment to 4kB and changes its reallocation algorithm.  Optionally, the GetDataString default buffer size/increment can be changed via the new pyodbc connect keyword getdata_str_size.  The supported range for the getdata_str_size keyword is 1kB - 128kB and the patch enforces this range.  If needed, the GetDataString buffer will be reallocated by doubling (up to 10 times) the previously used buffer size increment.  In summary, this patch allows the pyodbc user to control the maximum GetDataString buffer size needed for their applications within the range of 2047kB - 262016kB.
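The 2047 kB and 262016 kB bounds are consistent with a buffer that starts at the configured size and then grows ten times, each time by an increment twice as large as the previous one. The sketch below models that reading in Python; it is an interpretation that reproduces the quoted numbers, not code taken from the patch, and the function name is illustrative.

```python
def total_buffer_kb(increment_kb, reallocations=10):
    """Total buffer size in kB when each reallocation adds a doubled increment."""
    total = increment_kb   # initial buffer equals the configured size
    step = increment_kb
    for _ in range(reallocations):
        step *= 2          # the increment doubles on every reallocation
        total += step      # the buffer grows by the new increment
    return total


print(total_buffer_kb(1))     # 2047 kB, the lower bound quoted above
print(total_buffer_kb(128))   # 262016 kB, the upper bound quoted above
print(total_buffer_kb(4))     # 8188 kB with the 4 kB default
```

In closed form this is increment * (2^11 - 1), since the sizes added form the series 1 + 2 + 4 + ... + 1024 times the configured increment.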

I've also attached updated Windows binaries for Python 2.7 and Python 3.3, if anyone wants to further test this patch.
Attachments:
xmlcol.patch (3.2 KB)
pyodbc-3.0.6.win32-py3.3.exe (226 KB)
pyodbc-3.0.6.win32-py2.7.exe (231 KB)
Jul 17, 2013
#6 pe...@psantoro.net
I've updated my patch for the pyodbc 3.0.7 release.  I've also included a few binaries for anyone who might need them.  The binaries were built using Python 2.7.5 and Python 3.3.2.
Attachments:
pyodbc-3.0.7.win32-py2.7.exe (232 KB)
pyodbc-3.0.7.win32-py3.3.exe (227 KB)
pyodbc-3.0.7.win-amd64-py3.3.exe (262 KB)
getdata_str_size.patch (3.3 KB)
Jan 24, 2015
#8 lamba...@gmail.com
I was having the same issue. Specifically, the problem occurs with Native Client 11 but not with Native Client 10. I've gone back to Native Client 10 as a workaround, but this patch also fixes the problem.
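For anyone trying the driver workaround, the driver is selected through the DRIVER keyword of the ODBC connection string. A minimal sketch, assuming the driver is installed under its usual name "SQL Server Native Client 10.0" and that Windows authentication is used; the helper name, server and database are placeholders.

```python
def sql_server_conn_str(server, database, native_client_version="10.0"):
    """Build an ODBC connection string pinned to a Native Client version.

    Hypothetical helper; server and database are placeholders supplied by the caller.
    """
    driver = f"SQL Server Native Client {native_client_version}"
    return (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        "Trusted_Connection=yes"
    )


print(sql_server_conn_str("myserver", "mydb"))
# Usage with pyodbc (not executed here):
# conn = pyodbc.connect(sql_server_conn_str("myserver", "mydb"))
```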


