improve performance of pointer scan for leak reachability analysis #151
Comments
From bruen...@google.com on January 12, 2012 11:37:45
wow, looks like I pasted too much stuff into the initial entry text above
xref issue #568 (parallelize leak scan)
** INFO breakdown of -leaks_only -no_zero_stack on ui_tests; this is NPAPITesterBase.NPObjectProxy
[pasted profiler output elided: only the repeated "Process Name / 64-bit / Timer samples / CS:EIP Symbol + Offset" column headers survived]
analysis: parallelizing seems most promising for large enough apps

Owner: bruen...@google.com
From bruen...@google.com on January 13, 2012 07:25:18
running entire unit_tests suite (minus tools/valgrind exclusions): average time in minutes with -no_check_uninitialized: 25.7482
I wonder how many sub-processes there are, for the scan to take that long; bears investigation whether that's all on the final scan. Suite log dir shows 9 sets of logdirs, which makes sense since 3x3 runs.
From bruen...@google.com on January 13, 2012 07:33:39
[pasted timing table elided: only the column headers "leak overhead", "%scan", "%rest" survived]
From bruen...@google.com on January 17, 2012 08:25:38
more runs over weekend on windesk for some reason has
From bruen...@google.com on January 17, 2012 08:30:15
I ran like so:
for j in "-no_check_uninitialized" \
         "-no_check_uninitialized -no_leak_scan" \
         "-no_check_uninitialized -no_count_leaks" \
         "-no_check_uninitialized -no_count_leaks -no_use_symcache" \
         "-leaks_only" \
         "-leaks_only -no_zero_stack" \
         "-leaks_only -no_count_leaks" \
         "-leaks_only -no_count_leaks -no_track_allocs"; do
From bruen...@google.com on December 12, 2012 09:30:57
xref issue #1096
From derek.br...@gmail.com on December 10, 2010 17:58:04
PR 475518
my initial implementation of scanning for pointers for reachability analysis produces a noticeable 1-second
delay at app exit (and at nudge time for PR 428709). this case covers trying to speed that up. it's not
identifying the defined region that's slow; it's the pointer scan itself.
adding to this case a FIXME from my code that may improve perf noticeably:
skipping regions that have been read-only since being loaded (.text,
.rodata, etc.). we'll need extra tracking since today we can't tell
whether they were made writable and then changed back to read-only.
we could also skip writable regions that have not been written to since
being loaded since they can't point into the heap. technically
the app could write a heap pointer to a file and then mmap it in but we'll
ignore that.
my initial impl is going to skip r-x regions, though I'm not going to skip all
non-writable regions. this case covers adding modified-since-load info
both to avoid false negatives w/ r-x and to skip more non-writable regions.
it does take substantial time on large apps.
we should probably fix up -no_count_leaks and document it, and perhaps make it
the default so that nudges can be used just to update the error summary w/o
waiting 8 minutes. just turning it off today => bugs: maybe somebody else needs
op_record_allocs in alloc.c. we still record for pre_us.
for PR 485354 I have the leak scan on Windows skipping read-only image regions; once this case adds history tracking, that can be done for *nix too.
PR 520916: leak-check-only mode without losing accuracy
disabling everything except malloc instrumentation.
it could be more thorough (e.g., don't need known_table, etc.) but
should prove useful. this is what my original -leaks_only experiment
used so I kept the code under -no_shadowing when I added stack zeroing.
for more than just options.shadowing: also for PR 536878 where
drheapstat will need fork-following
I needed PR 536058 in DR (diff sent earlier; included in this tree since it's
needed to test).
adapting drmem's adjust_esp shadow code
using shadow info for regular drmem, I'm brute-force looking up every
8 bytes, but only on Windows, since I'm assuming glibc malloc never
unmaps an arena that contains live mallocs. xref PR 535568's attempt
to replace the malloc table w/ an interval tree: too expensive!
PR 485354: # possible leaks nondet on nudge test on Windows
My leaks-only test hit PR 485354 so I investigated:
PR 475518 covers doing that properly and on both platforms by
monitoring history
The beyond-TOS + zero-on-stack-alloc seems to be working in practice as far
as I can tell (apps aren't exactly deterministic but results
seem to match pretty well w/ full shadowing). Zeroing is not 100%
transparent but it's close enough for me. We'll need more data on more
apps but I think it's going to work out, and the overhead should be a
strict subset of my original shadow-writes proposal. Unfortunately the
overhead is still significant: 2.2x on crafty (vs 1.6x for DrHeapstat).
xref PR 536878: add leak checking feature to Dr. Heapstat a la Dr. Memory
-leaks_only
future work under PR 539395: improve accuracy of -leaks_only:
could intercept signal + sigreturn.
some other way to improve accuracy and efficiency of locating live
mallocs in an unmapped arena
Original issue: http://code.google.com/p/drmemory/issues/detail?id=151