Release2009Q3  
Details on the 2009Q3 release
Updated Jan 8, 2010 by collinw

Unladen Swallow 2009Q3

Unladen Swallow 2009Q3 is the second release of Unladen Swallow to use LLVM for native code generation, and the first to use runtime feedback for optimization. To obtain the 2009Q3 release, run

svn checkout http://unladen-swallow.googlecode.com/svn/branches/release-2009Q3-maint unladen-2009Q3

The Unladen Swallow team does not recommend wide adoption of the 2009Q3 release. This is intended as a checkpoint of our progress, a milestone on the long path to our eventual performance goals. Note that Unladen Swallow tracks LLVM trunk fairly closely, and will not build against LLVM 2.5 or 2.6.

Highlights:

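  • Unladen Swallow 2009Q3 uses up to 90% less memory than the 2009Q2 release (the benchmark output below reports this as roughly 930% smaller because it divides the difference by the new value rather than the old one).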
  • Execution performance has improved by 15-70%, depending on benchmark.
  • Unladen Swallow 2009Q3 integrates with gdb 7.0 to better support debugging of JIT-compiled code.
  • Unladen Swallow 2009Q3 integrates with OProfile 0.9.4 and later to provide seamless profiling across Python and C code, if configured with --with-oprofile=<oprofile-prefix>.
  • Many bugs and restrictions in LLVM's JIT have been fixed. In particular, the 2009Q2 limitation of 16MB of machine code has been lifted.
  • Unladen Swallow 2009Q3 passes the tests for all the third-party tools and libraries listed on the Testing page. Significantly for many projects, this includes compatibility with Twisted, Django, NumPy and Swig.

Lowlights:

  • LLVM's JIT and other infrastructure needed more work than was expected. As a result, we did not have time to improve performance as much as we would have liked.
  • Memory usage is still 2-3x that of Python 2.6.1. However, there is more overhead that can be eliminated for the 2009Q4 release.

Memory Usage

In the memory benchmarks, we compared the fastest configuration for Q3 against the fastest configuration for Q2. The Q2 configuration is the same as the one reported in Release2009Q2. The performance degradation/improvement is calculated using ((new - old) / new). Because the difference is divided by the new value, a reduction can be reported as more than 100% smaller. Units are kilobytes.

2009Q2 vs 2009Q3
slowspitfire:
$ ./perf.py -r -b slowspitfire --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 212344.000 -> 96884.000: 119.17% smaller
Usage over time: http://tinyurl.com/yfy3w3p

ai:
$ ./perf.py -r -b ai --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 95012.000 -> 14020.000: 577.69% smaller
Usage over time: http://tinyurl.com/yz7v4xj

slowpickle:
$ ./perf.py -r -b slowpickle --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 96876.000 -> 18996.000: 409.98% smaller
Usage over time: http://tinyurl.com/yf4a3sj

slowunpickle:
$ ./perf.py -r -b slowunpickle --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 96876.000 -> 14076.000: 588.24% smaller
Usage over time: http://tinyurl.com/yfzv2mn

django:
$ ./perf.py -r -b django --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 159064.000 -> 27160.000: 485.66% smaller
Usage over time: http://tinyurl.com/ykdmdml

rietveld:
$ ./perf.py -r -b rietveld --args "-j always," --track_memory ../q2/python ../q3/python
Mem max: 575116.000 -> 55952.000: 927.87% smaller
Usage over time: http://tinyurl.com/yf3rcbb
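
As a rough check on the figures above, the sketch below recomputes the rietveld result with the stated ((new - old) / new) formula and, for comparison, with the more conventional (new - old) / old. The function names are made up for this illustration and are not part of perf.py.

```python
def reported_change(old, new):
    """Percentage as printed in the output above: (new - old) / new."""
    return (new - old) / new * 100.0


def change_relative_to_old(old, new):
    """Conventional percentage change, measured against the old value."""
    return (new - old) / old * 100.0


# Peak memory in kilobytes, copied from the rietveld result above.
old_kb, new_kb = 575116.0, 55952.0

# Both results are negative for a reduction, so negate them for display.
print("reported:     %.2f%% smaller" % -reported_change(old_kb, new_kb))
print("conventional: %.2f%% smaller" % -change_relative_to_old(old_kb, new_kb))
print("ratio:        %.2fx" % (old_kb / new_kb))
```

For rietveld this gives 927.87% smaller by the reported formula and 90.27% smaller relative to Q2; both describe the same drop to roughly a tenth of the Q2 peak.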

GDB Support

The Unladen Swallow team added support to gdb 7.0 that allows JIT compilers to emit DWARF debugging information so that gdb can function properly in the presence of JIT-compiled code. This interface should be sufficiently generic that any JIT compiler can take advantage of it.

Example backtrace before:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aaaabdfbd10 (LWP 25476)]
0x00002aaaabe7d1a8 in ?? ()
(gdb) bt
#0  0x00002aaaabe7d1a8 in ?? ()
#1  0x0000000000000003 in ?? ()
#2  0x0000000000000004 in ?? ()
#3  0x00032aaaabe7cfd0 in ?? ()
#4  0x00002aaaabe7d12c in ?? ()
#5  0x00022aaa00000003 in ?? ()
#6  0x00002aaaabe7d0aa in ?? ()
#7  0x01000002abe7cff0 in ?? ()
#8  0x00002aaaabe7d02c in ?? ()
#9  0x0100000000000001 in ?? ()
#10 0x00000000014388e0 in ?? ()
#11 0x00007fff00000001 in ?? ()
#12 0x0000000000b870a2 in llvm::JIT::runFunction (this=0x1405b70,
F=0x14024e0, ArgValues=@0x7fffffffe050)
   at /home/rnk/llvm-gdb/lib/ExecutionEngine/JIT/JIT.cpp:395
#13 0x0000000000baa4c5 in llvm::ExecutionEngine::runFunctionAsMain
(this=0x1405b70, Fn=0x14024e0, argv=@0x13f06f8, envp=0x7fffffffe3b0)
   at /home/rnk/llvm-gdb/lib/ExecutionEngine/ExecutionEngine.cpp:377
#14 0x00000000007ebd52 in main (argc=2, argv=0x7fffffffe398,
envp=0x7fffffffe3b0) at /home/rnk/llvm-gdb/tools/lli/lli.cpp:208

And a backtrace after this patch:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaabe7d1a8 in baz ()
(gdb) bt
#0  0x00002aaaabe7d1a8 in baz ()
#1  0x00002aaaabe7d12c in bar ()
#2  0x00002aaaabe7d0aa in foo ()
#3  0x00002aaaabe7d02c in main ()
#4  0x0000000000b870a2 in llvm::JIT::runFunction (this=0x1405b70,
F=0x14024e0, ArgValues=...)
   at /home/rnk/llvm-gdb/lib/ExecutionEngine/JIT/JIT.cpp:395
#5  0x0000000000baa4c5 in llvm::ExecutionEngine::runFunctionAsMain
(this=0x1405b70, Fn=0x14024e0, argv=..., envp=0x7fffffffe3c0)
   at /home/rnk/llvm-gdb/lib/ExecutionEngine/ExecutionEngine.cpp:377
#6  0x00000000007ebd52 in main (argc=2, argv=0x7fffffffe3a8,
envp=0x7fffffffe3c0) at /home/rnk/llvm-gdb/tools/lli/lli.cpp:208

So much nicer.

See http://llvm.org/docs/DebuggingJITedCode.html for more details. Thanks to our intern, Reid Kleckner, for doing the heavy lifting on this feature!

Benchmarks

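2009Q3 uses a more sophisticated system for determining which functions to compile than 2009Q2 did. Accordingly, we no longer use Unladen Swallow's -j always option when benchmarking 2009Q3. In the commands below, the text before the comma in --args "-j always," applies to the 2009Q2 binary only; nothing follows the comma, so the 2009Q3 binary receives no extra arguments.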

Benchmarking was done on an Intel Core 2 Duo 6600 @ 2.40GHz with 4GB RAM and a 32-bit userspace. The performance degradation/improvement is calculated using ((new - old) / new).

2009Q2 vs 2009Q3

slowspitfire:
$ ./perf.py -r -b slowspitfire --args "-j always," ../q2/python ../q3/python
Min: 0.690717 -> 0.622342: 10.99% faster
Avg: 0.692846 -> 0.624929: 10.87% faster
Significant (t=165.901211, a=0.95)
Stddev: 0.00348 -> 0.00215: 62.23% smaller

ai:
$ ./perf.py -r -b ai --args "-j always," ../q2/python ../q3/python
Min: 0.525973 -> 0.459890: 14.37% faster
Avg: 0.529790 -> 0.464647: 14.02% faster
Significant (t=69.943861, a=0.95)
Stddev: 0.00238 -> 0.00900: 73.55% larger

slowpickle:
$ ./perf.py -r -b slowpickle --args "-j always," ../q2/python ../q3/python
Min: 0.732290 -> 0.597355: 22.59% faster
Avg: 0.733397 -> 0.615644: 19.13% faster
Significant (t=13.096018, a=0.95)
Stddev: 0.00208 -> 0.08989: 97.68% larger

slowunpickle:
$ ./perf.py -r -b slowunpickle --args "-j always," ../q2/python ../q3/python
Min: 0.314137 -> 0.264590: 18.73% faster
Avg: 0.314825 -> 0.276463: 13.88% faster
Significant (t=9.762778, a=0.95)
Stddev: 0.00100 -> 0.03928: 97.45% larger

django:
$ ./perf.py -r -b django --args "-j always," ../q2/python ../q3/python
Min: 1.095181 -> 0.946080: 15.76% faster
Avg: 1.096714 -> 0.949940: 15.45% faster
Significant (t=315.826693, a=0.95)
Stddev: 0.00088 -> 0.00456: 80.82% larger

rietveld:
$ ./perf.py -r -b rietveld --args "-j always," ../q2/python ../q3/python
Min: 0.578493 -> 0.516558: 11.99% faster
Avg: 0.583965 -> 0.619006: 5.66% slower
Significant (t=-2.009135, a=0.95)
Stddev: 0.00804 -> 0.17422: 95.39% larger

call_simple:
$ ./perf.py -r -b call_simple --args "-j always," ../q2/python ../q3/python
Min: 1.618273 -> 0.908331: 78.16% faster
Avg: 1.632256 -> 0.924890: 76.48% faster
Significant (t=433.008411, a=0.95)
Stddev: 0.00847 -> 0.01397: 39.38% larger
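
The "Significant (t=...)" lines above report a t statistic for the difference between the two sets of timings. The sketch below shows one common way to compute such a value, assuming a pooled-variance two-sample Student's t statistic over equally sized samples; whether perf.py computes it exactly this way is an assumption, and the timings are hypothetical, not taken from the runs above.

```python
import math


def mean(xs):
    return sum(xs) / float(len(xs))


def pooled_sample_variance(a, b):
    """Pooled variance of two samples, each measured about its own mean."""
    mean_a, mean_b = mean(a), mean(b)
    squares = (sum((x - mean_a) ** 2 for x in a) +
               sum((x - mean_b) ** 2 for x in b))
    return squares / float(len(a) + len(b) - 2)


def t_score(a, b):
    """Two-sample t statistic for equally sized samples a (old) and b (new)."""
    assert len(a) == len(b)
    error = pooled_sample_variance(a, b) / len(a)
    return (mean(a) - mean(b)) / math.sqrt(error * 2)


# Hypothetical timings in seconds, NOT measurements from this release.
old_times = [0.70, 0.69, 0.71, 0.70]
new_times = [0.62, 0.63, 0.62, 0.61]
print("t = %.2f" % t_score(old_times, new_times))
```

With this sign convention a positive t means the new binary was faster on average, which matches the output above: rietveld, the only benchmark whose average got slower, is also the only one with a negative t.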

Comment by kalle.ha...@gmail.com, Nov 11, 2009

Seems like nice progress! Definitely an interesting project to follow as a Python coder. On the other hand, one could remark about the comparisons. It's hard for memory usage to be over 100% smaller than what it's compared to, but over 100% larger works....

Comment by nix...@gmail.com, Nov 11, 2009

Awesome - keep up the good work!

Comment by volshebnyi, Nov 11, 2009

Really great improvement since Q2, waiting for optimized regex in Q4. Thanks for new backtrace! =)

Comment by doug.far...@gmail.com, Nov 13, 2009

I'm curious whether the release of the Go language by Google will impact the Unladen Swallow project. Personally, I hope not. :)

Comment by lobais, Nov 17, 2009

#kalle.happonen yes, I thought the same. Probably they mean that Q2 uses 930% more than Q3, giving a reduction of 89%. Still, a very nice accomplishment.

#doug.farrell I doubt that either Go or Noop will impact Unladen Swallow too much. I think they are both rather experimental, fun projects, whereas the swallow is very serious.

Comment by khame...@gmail.com, Dec 2, 2009

oh, I've totally overlooked the benchmark sections...

BTW, what does it mean "Mem max: 212344.000 -> 96884.000: 119.17% smaller" ?

"119.17% smaller" sounds like negative size :o)

Comment by jabron...@gmail.com, Dec 7, 2009

psyched about your progress, keep up the great work!

Comment by connelly...@gmail.com, Dec 11, 2009

cool! what is the performance relative to the mainline python 2.6 implementation?

Comment by guy.sloa...@gmail.com, Dec 22, 2009

This is great progress. Congratulations.

Comment by flashdes...@gmail.com, Jan 4, 2010

It is impossible to use 930% less of something without resulting in a negative amount. I think they meant that the previous version used up to 930% more memory.

Comment by ng.hong.quan, Jan 7, 2010

Awesome!

Comment by ingemar....@gmail.com, Feb 9, 2010

The percentages are strangely computed. As several other readers have commented, you cannot reduce a value by more than 100%. It would be much better to determine the reduction in terms of the old version, like (new - old) / old. For the "927%" example, the result would become -0.9027 = a reduction of 90.27%.

Comment by leon.mat...@gmail.com, Feb 9, 2010

Remember, a percentage is just a way to express a ratio. The ratio of memory use between the old and new versions is 9.0. It is quite reasonable to express that ratio as 900%.

Comment by eric.koc...@gmail.com, Feb 18, 2010

@leon: Agreed, while you can have a ratio of 9:1 which is 900%, that ratio is comparing the numerator (old version) to the denominator (new version). Google, however, is trying to compare the new to the old.

Let's first agree that an old to new ratio of 1:2 means an increase of 100%. I'm using this as an example because you can just look at the ratio and easily see that it increased by 100%. We get this by (((new-old) / old) * 100) = percent_change, which we can see is the correct formula. If we apply that same formula to the numbers they gave above we get -90.27%.

Looks like Google marketing got their hands on the stats above ("900% is big, that'll sound better than 90%").

Comment by justin.e...@gmail.com, Mar 25, 2010

The use of percentages here and the word smaller is misleading, if not inaccurate. I would recommend leaving the output in ratio form.

Are the changes in standard deviations especially important?

Comment by tianpm...@gmail.com, Apr 2, 2010

So awesome!

Comment by amitnai...@gmail.com, Mar 31, 2011

Can I start a career in Python? I am a PHP developer with 5 months of experience. I am from India.

