Export to GitHub

unladen-swallow - issue #63

Improve oprofile support


Posted on Jun 22, 2009 by Helpful Cat

OProfile has support for attributing samples to particular functions and lines of code, even when that code is generated by a JIT compiler. This is described at http://oprofile.sourceforge.net/doc/devel/jit-interface.html. (The "agent library" described there is designed to work with http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html. An implementation is at http://oprofile.cvs.sourceforge.net/viewvc/oprofile/oprofile/agents/jvmti/li bjvmti_oprofile.c?view=markup.

I've got a very preliminary patch to do this at http://codereview.appspot.com/79049, but I need to fix that up so I can submit it to mainline LLVM.

Comment #1

Posted on Jun 22, 2009 by Helpful Cat

I'm using the attached file to test oprofile with the JIT.

Attachments

Comment #2

Posted on Jul 13, 2009 by Helpful Elephant

(No comment was entered for this change.)

Comment #3

Posted on Aug 28, 2009 by Helpful Elephant

Jeffrey, can you update the status of this? This is in oprofile, right? Which version?

Comment #4

Posted on Sep 26, 2009 by Helpful Cat

OProfile 0.9.5 should work when you pass --enable-oprofile= to configure. We've merged the relevant patches from LLVM. I haven't tested or documented this in Python yet though.

Comment #5

Posted on Dec 17, 2009 by Helpful Elephant

Issue 99 has been merged into this issue.

Comment #6

Posted on Dec 17, 2009 by Helpful Elephant

We still need a slight patch to Unladen to make the OProfile support work well. We also need to patch up the build system to pass LLVM's full flag set through to embedding apps.

Comment #7

Posted on Jan 6, 2010 by Helpful Elephant

(No comment was entered for this change.)

Comment #8

Posted on Jan 11, 2010 by Helpful Elephant

This was previously blocked on using llvm-config to obtain the necessary flags; that support was added in r981.

Comment #9

Posted on Jan 11, 2010 by Helpful Elephant

(No comment was entered for this change.)

Comment #10

Posted on Jan 12, 2010 by Helpful Elephant

Basic oprofile support was added in r986:

$ sudo opcontrol --reset Signalling daemon... done $ sudo opcontrol --start-daemon; sudo opcontrol --start; PYTHONPATH=../tests/lib/django/ ./python ../tests/performance/bm_django.py -n 100; sudo opcontrol --stop

$ opreport -l ./python | less CPU: Core 2, speed 1600 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000 samples % image name symbol name 79589 4.2329 python PyString_FromFormatV 62971 3.3491 python PyEval_EvalCodeEx 62713 3.3354 python tupledealloc 57071 3.0353 python _PyEval_CallFunction 50009 2.6597 24532.jo #u#force_unicode 47468 2.5246 python PyUnicodeUCS2_Decode 45829 2.4374 python PyFrame_New 45173 2.4025 python lookdict_string 43082 2.2913 python PyType_IsSubtype 39763 2.1148 24532.jo #u#render5 38145 2.0287 python _PyType_Lookup 37643 2.0020 python PyObject_GC_UnTrack 37105 1.9734 python frame_dealloc 36849 1.9598 python PyEval_EvalFrame 35630 1.8950 24532.jo #u#resolve 33313 1.7717 python PyObject_IsInstance 33208 1.7662 python PyDict_GetItem 33168 1.7640 python PyTuple_New 30458 1.6199 python PyCFunction_NewEx 29769 1.5833 python PyObject_Malloc 29710 1.5801 python PyString_FromString 27589 1.4673 python PyObject_Call 27266 1.4501 python vgetargskeywords 26578 1.4135 python PyObject_GenericGetAttr 24585 1.3075 python meth_dealloc 22249 1.1833 python _PyString_Eq 22123 1.1766 python PyObject_CallFunctionObjArgs 22034 1.1719 python PyCFunction_Call 20037 1.0657 python convertsimple 19574 1.0410 24532.jo #u#render2 17984 0.9565 python type_getattro 17503 0.9309 python PyObject_Free 17117 0.9104 python PyObject_GetAttr 17031 0.9058 python recursive_isinstance 16243 0.8639 python PyUnicodeUCS2_DecodeASCII 15946 0.8481 24532.jo #u#escape3 15675 0.8337 python string_hash 15428 0.8205 24532.jo #u#mark_safe 14068 0.7482 python _PyEval_CallFunctionVarKw 13938 0.7413 24532.jo #u#_resolve_lookup 13541 0.7202 24532.jo #u#_dec 13477 0.7168 python listiter_next 12844 0.6831 24532.jo #u#mark_for_escaping 12046 0.6407 24532.jo #u#resolve1 10972 0.5835 python getset_get 9731 0.5175 24532.jo #u#render4 9385 0.4991 24532.jo #u#wrapper 9285 0.4938 python
mark_called_and_maybe_compile(PyCodeObject*, _frame*) 8919 0.4744 python PyImport_ImportModuleLevel 8896 0.4731 python load_next 8587 0.4567 python builtin_isinstance 8515 0.4529 python PyMethod_New 8318 0.4424 python _PyUnicode_New 8228 0.4376 python function_call 8124 0.4321 python instancemethod_dealloc 8085 0.4300 24532.jo #u#escape 7913 0.4209 python PyBool_FromLong

Comment #11

Posted on Jan 12, 2010 by Helpful Elephant

Now that we have the bare-bones oprofile support, we should make it better:

  • We need more meaningful function names: #u#render4 does not cut it. Module name and class name should probably figure in here.
  • We should support opreport's -g option. It currently prints

$ opreport -g -l ./python | head -n 15 Overflow stats not available CPU: Core 2, speed 1600 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000 samples % linenr info image name symbol name 77638 4.1247 stringobject.c:164 python PyString_FromFormatV 66563 3.5363 eval.cc:3078 python PyEval_EvalCodeEx 63566 3.3771 tupleobject.c:162 python tupledealloc 57153 3.0364 eval.cc:4237 python _PyEval_CallFunction 49921 2.6522 (no location information) 1320.jo #u#force_unicode 47317 2.5138 unicodeobject.c:1127 python PyUnicodeUCS2_Decode 42718 2.2695 dictobject.c:392 python lookdict_string 42197 2.2418 frameobject.c:581 python PyFrame_New 41919 2.2271 typeobject.c:1163 python PyType_IsSubtype 40159 2.1336 frameobject.c:408 python frame_dealloc 39234 2.0844 (no location information) 1320.jo #u#render5 38034 2.0207 gcmodule.c:1390 python PyObject_GC_UnTrack

We should turn those "(no location information)" strings into module names and line numbers.

  • Find a way to support cumulative timings, ideally via oprofile or, if necessary, some other profiling tool.

And of course, all this needs to be documented in the wiki.

Comment #12

Posted on Jan 12, 2010 by Swift Camel

To recap what I said over email in the bug tracker:

There is the module attribute hanging off of the function object, but that isn't the same as the code object. There could conceivably be two function objects with different values for func_module using the same code object, and we don't even have access to the function object from the code object.

There are also the co_filename and co_firstlineno attributes of code objects, which would at least be useful for tracking down the original definition. I think it might be best to take the basename of the filename, the line number, and the function name, and join them with colons and make that the new function name.

That basename(co_filename) : co_firstlineno info sounds like it would be exactly what we want to use for the line number information.

Comment #13

Posted on Jan 12, 2010 by Helpful Elephant

Possibly useful tool to mention in profiling guide: http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Comment #14

Posted on Jan 13, 2010 by Happy Horse

yep, Gprof2Dot is great :)

Comment #15

Posted on Feb 1, 2010 by Helpful Cat

Turns out that I put debug information behind python -g and then forgot about it. When you pass that while profiling, you get filenames and line numbers. Oops.

I've added a wiki page at http://code.google.com/p/unladen-swallow/wiki/UsingOProfile describing how to use OProfile with Unladen.

Comment #16

Posted on Feb 21, 2010 by Helpful Elephant

The -g option was removed in r1073; it is no longer required for oProfile integration.

Remaining issues: - Improve the names shown in the oProfile output. Currently these look like "#u#render5", which sucks. We should drop the "#u#" prefix down to just "#" (to make them not collide with C functions). We should also include the class name for methods; that "render5" name is actually the fifth Python function named "render". These should be "#MyClass.render" instead.

This should be low-hanging fruit for a Linux user with oProfile installed. Remember to configure LLVM with --with-oprofile=$PREFIX.

Comment #17

Posted on Feb 21, 2010 by Helpful Elephant

(No comment was entered for this change.)

Comment #18

Posted on Feb 23, 2010 by Helpful Giraffe

An initial attempt at improving the names is here: http://codereview.appspot.com/217105/show

Comment #19

Posted on May 21, 2010 by Helpful Elephant

r1157 simplified JIT function names from #u#foo to #foo. Thanks for the patch, Elliot!

Status: Accepted

Labels:
Type-Enhancement Instrumentation OpSys-Linux Release-Merger Priority-Medium StarterProject PyCon