Issue 120: Pretty slow data transfer, tinygrams likely the root problem
2 people starred this issue and may be notified of changes.
Status:  Investigating
Owner:  ----


 
Reported by strom...@gmail.com, Sep 10, 2010
What steps will reproduce the problem?
1. Run a small test program (sketched below) that does just a cursor.fetchall() on a database that's a few hundred megabytes in size
2. Time it
3. It's slow
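For concreteness, here's a minimal sketch of the kind of test program I mean (the DSN, credentials and table name are placeholders for my real, company-specific ones):

    import time
    import pyodbc

    # Placeholder connection string; substitute your own DSN and credentials.
    conn = pyodbc.connect('DSN=mydsn;UID=user;PWD=secret')
    cursor = conn.cursor()

    start = time.time()
    cursor.execute('SELECT * FROM big_table')  # a few hundred MB of rows
    rows = cursor.fetchall()
    elapsed = time.time() - start

    print '%d rows in %.1f seconds' % (len(rows), elapsed)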

What is the expected output? What do you see instead?
The result is correct, it's just too slow for me to use for my current project.  I suspect there's a minor misdesign somewhere, because if I strace the process on Ubuntu 10.04, I can see a main loop consisting of many calls to poll() and recv().  The problem is probably that the recv() calls are too numerous - some code at some layer of the communication should be transmitting a larger amount of data across the network and breaking it up later, to avoid all the network tinygrams I'm seeing.
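For reference, the pattern is easy to see with something like this (the script name is just a placeholder for my test program):

    # Log only network-related syscalls; -f follows any forked children.
    strace -f -e trace=network -o /tmp/pyodbc.strace python2.6 mytest.py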

What version of the product are you using? On what operating system?
I'm using pyodbc 2.1.7 on (Windows XP SP3 with ActivePython 2.6.5 and the "{SQL Server}" driver) and (Ubuntu 10.04 with Python 2.6.5 (the shipped version), unixodbc 2.2.11-21 (the shipped version) and FreeTDS 0.82-6build1 (the shipped version)).  Both scenarios are underperforming, I believe.

Please provide any additional information below.
The packets transferred tend to be roughly 500 bytes long - I suspect this is approximately the width of a row, plus TCP/IP overhead.

I don't know for a fact that this is a problem specific to pyodbc, but I can't help but wonder, since it seems to be slow on both OSes with quite different drivers.

Checking on XP with Wireshark, it does look like I'm getting some full (1500-byte) frames...  I'll probably try jumbo frames there to see if that helps, but the tiny frames on Ubuntu still point to a likely issue somewhere in the database communication.  And yes, it is a bit slower on Ubuntu than XP.

Thanks for the great software!  If this can't be sorted out, I'll have to go back to pywin32's "odbc" module - not pleasant ^_^


Sep 12, 2010
Project Member #1 mkleehammer
I'm building a test for this and I'd like to make sure I'm seeing something similar to your setup.  How large is each row (sounds like ~500 bytes?), how many rows did you fetch, and how long did it take?

Thanks.
Sep 12, 2010
Project Member #2 mkleehammer
I forgot to mention: please test with 2.1.8 if possible.  There are some changes, one of which is very important if you are selecting more than 400 characters/bytes from a single column.  Previously, anything over 400 or 500 bytes (I forget which) was considered a long value.  2.1.8 determines the maximum size when the first connection is made.
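If you want to see what sizes pyodbc is being told about, the DB API result-set metadata shows them without fetching any rows (a quick sketch; the table name is a placeholder):

    cursor.execute("SELECT * FROM your_table WHERE 1=0")  # metadata only, no rows
    for col in cursor.description:
        # col is (name, type_code, display_size, internal_size, precision, scale, null_ok)
        print col[0], col[4]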
Status: Investigating
Sep 12, 2010
Project Member #3 mkleehammer
This may be something we want to look at: http://support.microsoft.com/kb/157802

On a local server, I'm selecting 1M rows in 8.4 seconds, each about 500 bytes.
Sep 12, 2010
#4 strom...@gmail.com
I'll make sure I use 2.1.8 in my tests.

As far as the test code: I'm attempting to get permission to release (that part of) the code.
Sep 14, 2010
#5 strom...@gmail.com

Happily, in a sense, I was already using 2.1.8 on Linux.  On Windows, however, I somehow had 2.1.7.

My rows are, on average, 1638.784 bytes wide (they use lots of varchar and nvarchar - nvarchar being unicode - and that figure does not include indexes).

tsql (the minimalist FreeTDS database client that speaks only the TDS protocol) is also using just over 500 bytes per block for most of a similar data transfer.  I'm not sure that necessarily means this is a FreeTDS problem, but I don't mean to rule it out either.
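On the theory that the ~500-byte blocks are just FreeTDS's default TDS packet size, I may also try raising it in freetds.conf - a sketch, using the "initial block size" option the FreeTDS documentation describes; I haven't verified that 0.82 honors larger values:

    [myserver]
        host = dbserver.example.com
        port = 1433
        tds version = 8.0
        initial block size = 4096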

I've attached some test code to facilitate replication - with company-identifying details replaced with descriptions of what they should be set to in your environment.

Despite the tinygrams, the performance isn't terrible on Linux - on a 100BaseT link, I'm getting about 59.147 megabits/second from the attached test code, while with my pnetcat program (http://stromberg.dnsalias.org/~strombrg/pnetcat.html - it's in CPython, but doesn't use any database) I'm getting about 93.907 megabits/second, and iperf (which is in C++) gets about 93.8 megabits/second.  But both pnetcat and iperf represent relatively ivory-tower, best-case performance.

However, on Windows, the packets look larger (conceivably due to the Nagle algorithm - I don't know about that), but the data is moving slower (usually the reverse is true, as you likely know).  On Windows XP, I appear to be getting 8.227 megabits/second - and I'm afraid on this project, it's Windows that we need most.  Again, this is with 2.1.7.

Installing 2.1.8 on that same XP box, and rerunning my test program...  It now gets 8.212 megabits/second - so it may have gotten slightly slower with the pyodbc upgrade, but that's probably not a statistically significant difference.

BTW, I upgraded the device driver for the NIC in the XP box, but that seemed to help only marginally - probably not by a statistically significant amount.

Thanks for the great software!

PS: I'm using (k_in_table_according_to_sp_spaceused)*1024*8 / (10^6 * seconds_for_transfer) to compute megabits/second.
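Or, restated as Python (kb_used being the KB figure sp_spaceused reports for the table):

    def megabits_per_second(kb_used, seconds):
        # KB -> bytes -> bits, then divide by decimal mega and elapsed time
        return kb_used * 1024 * 8 / (10.0 ** 6 * seconds)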


Attachment: replication.tar.bz2 (2.0 KB)
Sep 14, 2010
#6 strom...@gmail.com
BTW, I'm not sure enabling jumbo frames is going to work out - the driver for the NIC in the XP box doesn't appear to allow them, despite the driver upgrade.  Also, a relevant IT guy here was pretty opposed to using them.


Sep 14, 2010
#8 strom...@gmail.com
An iperf from the ODBC client to the ODBC server gets 247 megabits/second.

A pnetcat of 256M (being careful to avoid the cygwin Python, for speed) from same to same gets 356.310 megabits/second.

pnetcat's throughput is still only about a third of the theoretical gigabit rate; jumbo frames would likely help these programs, but not _necessarily_ ODBC.  It is at least clearly a gigabit-class link, though.

On the Windows XP client I invoked pnetcat with (that's cygwin dd, BTW):
dd if=/dev/zero bs=1024k count=256 | \
        c:/Python26/python.exe pnetcat.py -O host.net 5555 -i -w $((2**20)) -N 1 -v

On the Windows 7 server I invoked pnetcat with:
c:/Python26/python.exe pnetcat.py -I 5555 -n -w 1048576 -N 1

Sep 14, 2010
#9 strom...@gmail.com
Just to be totally clear: the tinygrams on Linux may not be all that related to the underperformance on Windows, because Windows appears to be sending full-length, standard Ethernet frames.

However, if the tinygrams can be eliminated on Linux, that might make it even faster.


Sep 14, 2010
#10 strom...@gmail.com

The "odbc" module that comes with "Python for Windows Extensions" (http://python.net/crew/mhammond/win32) is getting 112.026 megabits/second on the same server, database and table - that's a little over a factor of 13 faster.

Sep 14, 2010
#11 strom...@gmail.com
Another correction: time_with("import odbc" on XP) < time_with("import pyodbc" on Ubuntu) < time_with("import pyodbc" on XP)

That is, the first listed is fastest, the last listed is slowest.

Sep 15, 2010
Project Member #12 mkleehammer
The pyodbc vs odbc comparison is probably the best place to start.

Is it possible to produce a small ODBC trace of each for your test, perhaps fetching only two rows instead of all of them?  What I'm interested in is (1) what ODBC calls pyodbc is making that might be unnecessary and (2) what data types are involved.
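On the unixODBC side, tracing can be switched on in odbcinst.ini (the trace file path is arbitrary); on Windows, the Tracing tab of the ODBC Data Source Administrator does the same thing:

    [ODBC]
    Trace     = Yes
    TraceFile = /tmp/odbc.trace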

For the second point, I might need to look at what the two modules are allocating and how odbc handles Unicode.  If it is Unicode-related, I might have to add some #ifdefs specific to 32-bit and 64-bit Windows and optimize those cases into memcpys.  (One weakness of C++ compilers is the inability to use sizeof() in compile-time decisions without resorting to template specialization, which I really don't think would be portable enough.)

Thanks.
Sep 15, 2010
#13 strom...@gmail.com

Attached to this message is a trace from two rows on Ubuntu 10.04.

Sep 15, 2010
#14 strom...@gmail.com

And here are two traces from Windows XP: One from pyodbc, one from odbc.

Sep 15, 2010
#15 strom...@gmail.com

Please let me know if there's anything further you need to look into this.

Sep 15, 2010
Project Member #16 mkleehammer
OK, I've found a few things:

1) There were two loops calling SQLDescribeCol after a select, which is obviously unnecessary work.  I've eliminated that in a test version here.  I would expect this to be negligible, though you are selecting a lot of columns.

2) Related: if you don't need all of the columns, reducing them is your biggest performance win, even with the odbc module.  (A quick illustration follows below.)

3) The odbc module is pre-allocating memory and binding into it, which is faster.  I just didn't expect it to be that much faster.  I'm going to create a performance branch and see what it would take to do something similar.  It might allocate a lot more memory, though.
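To illustrate point 2: naming only the columns you actually need, rather than selecting them all, avoids fetching and converting data you'll throw away (the column names here are hypothetical):

    # Instead of: cursor.execute("SELECT * FROM big_table")
    cursor.execute("SELECT col_a, col_b, col_c FROM big_table")
    rows = cursor.fetchall()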
Sep 15, 2010
#17 strom...@gmail.com

Great news!

Thanks.

Sep 15, 2010
#18 strom...@gmail.com
BTW, we do need lots of columns - about 120 of them.

Sep 15, 2010
#19 strom...@gmail.com
There are some interesting articles on the web about speeding up ODBC data transfers if you Google for "ODBC bulk fetch" or "ODBC firehose".

HTH ^_^


