automatically retrieve symbols from MS symbol server #143

Open
derekbruening opened this issue Nov 28, 2014 · 12 comments

Comments

@derekbruening
Contributor

From derek.br...@gmail.com on December 10, 2010 17:57:59

PR 463897

several features to improve winsyms.c, the Windows addr2line app that uses
dbghelp.dll from PR 463895:

  • until we have full paths from DR (issue #138, "[windows] invoke postprocess.pl for child processes"):
    have DrMem log all .dll files opened and then add those dirs to search path?
  • if we pass in a handle to target process (have a "pid=" input), this will
    load symbols for all its modules, so we could avoid having to load them
    one by one or to guess their paths. but still have to handle modules
    loaded later. and no guarantee process still alive by time sideline
    symbol processor accesses it.
  • support symbol servers, so users can use downloaded Windows system pdbs
  • be more robust about handling failures packing in loaded modules. E.g.,
    today we will probably fail if passed two .exe's (non-relocatable). we
    should query the address space to find more appropriate places to load,
    and potentially unload existing modules (they'll be re-loaded on demand
    later).
  • have winsyms take in a query and return whether a module has only
    export symbols?

Original issue: http://code.google.com/p/drmemory/issues/detail?id=143

@derekbruening
Contributor Author

From bruen...@google.com on January 25, 2011 14:49:02

for symbol servers:

  • should have a downstream store to cache files, and some user control over where to store it and a max size
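A minimal sketch of the "max size" part of that cache, assuming the downstream store is a plain directory tree and that evicting the oldest-modified files first is acceptable. `prune_symbol_cache` is a hypothetical helper, not part of winsyms.c or drsyms.

```python
import os
from pathlib import Path


def prune_symbol_cache(cache_dir, max_bytes):
    """Delete oldest-modified files until the cache fits in max_bytes.

    Returns the list of paths that were removed.
    """
    files = [p for p in Path(cache_dir).rglob("*") if p.is_file()]
    # Oldest modification time first, so stale symbols go before fresh ones.
    files.sort(key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in files)
    removed = []
    for p in files:
        if total <= max_bytes:
            break
        size = p.stat().st_size
        p.unlink()
        total -= size
        removed.append(p)
    return removed
```

The user-controlled location would simply be the `cache_dir` argument; a real store would probably want to evict by last access rather than last modification.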

Summary: [windows] improve winsyms.c and drsyms container case

@derekbruening
Contributor Author

From bruen...@google.com on April 26, 2011 07:49:19

to get symbol server support working in-process, need to hook priv kernel32's GetModuleFileName. note that the symsrv dll goes and loads over 45 libraries so maybe we don't really want it in-process: xref sideline options (DRi#44). could also have a separate step that runs a separate process to go and cache symbols for all the main libs on the system, along w/ any in recent results.txt files, and have frontend run it.

xref issue #290 : should perhaps put in sym+offset so we can tell how wrong the callstacks are.

the differences between having private syms and not makes it difficult to have default suppressions.

@derekbruening
Contributor Author

From bruen...@google.com on April 26, 2011 07:54:04

I should mention that for now, locally, I've been retrieving private syms w/ windbg (need them for debugging anyway) and then pointing Dr. Memory at them via _NT_SYMBOL_PATH
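For reference, a sketch of how a launcher could build that variable, assuming the standard `srv*<local cache>*<server URL>` syntax and Microsoft's public symbol server URL. `make_symbol_path` and `env_with_symbol_path` are hypothetical helpers, and the `srv*` entry only has an effect if symsrv.dll is available to dbghelp.

```python
import os

# Microsoft's public symbol server.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def make_symbol_path(cache_dir, local_dirs=(), server=MS_SYMBOL_SERVER):
    """Build a symbol search path: local dirs first, then a cached server entry."""
    entries = list(local_dirs)
    # "srv*<cache>*<server>" makes symsrv download into <cache> on a miss.
    entries.append(f"srv*{cache_dir}*{server}")
    return ";".join(entries)


def env_with_symbol_path(cache_dir, local_dirs=(), base_env=None):
    """Return a copy of the environment with _NT_SYMBOL_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["_NT_SYMBOL_PATH"] = make_symbol_path(cache_dir, local_dirs)
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the tool, which leaves the caller's own environment untouched.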

@derekbruening
Contributor Author

From bruen...@google.com on April 26, 2011 08:15:12

Owner: bruen...@google.com

@derekbruening
Contributor Author

From bruen...@google.com on May 06, 2011 13:01:38

we may want symbol store features for syscall handling: issue #388

@derekbruening
Contributor Author

From timurrrr@google.com on May 26, 2011 13:47:53

One of the possible solutions:

  • Create a list of DLLs we need at the startup (e.g. ntdll, user32, etc)
    -> load PDBs on start if they are not present -> this is a one-time operation.
    Sometimes you'll need to fetch stuff one more time after a system update but still this is very rare.
    On the first run, or on a hash/version mismatch, we may need to connect to the internet.
    WDYT?

The remaining DLLs are only needed for the reports/suppression matching and both things can be done at the end/postprocess.

@derekbruening
Contributor Author

From bruen...@google.com on May 26, 2011 23:01:57

symbols are needed for more than error reports: static CRT is not uncommon on Windows and symbols are needed to find malloc & co. for such modules

Dr. Memory originally used a postprocess model, and still does on Linux and cygwin. It is messy wrt child processes. Much cleaner to have online symbol access, except when the address space fills up, when sideline seems the best way to go. Sideline can also accommodate "online" symsrv.

@derekbruening
Contributor Author

From bruen...@google.com on August 13, 2012 20:51:42

now that we have the drsyms extension as part of DR, some of the features here are on the DR tracker. packing modules is DRi#449. symbol store support is DRi#450.

so I'm limiting this issue to covering just the part about having Dr. Memory automatically retrieve symbols from the MS symbol server.

Summary: automatically retrieve symbols from MS symbol server
Owner: ---
Labels: -Priority-Low Priority-High

@derekbruening
Contributor Author

From bruen...@google.com on August 13, 2012 21:19:51

/MDd support is what's driving this. pasting in my notes:

** TODO (issue #143): auto-download MS symbols for msvc* and system libs

xref issue #143 and issue #388 which have some discussion of auto-downloading
symbols.

for part A and part C we would like msvcrt and msvcpt symbols for simplest
solution

xref DRi#450: drsyms windows symbol store support
symsrv.dll is redistributable across all versions of DTW.
but can't use it easily inside drmemorylib.
so, if drmemorylib sees msvc*.dll and it can't find the syms it needs (or
just do pdb presence query), it gives a fatal error and aborts w/ message
to manually run a command.
that command is also run at installation.
frontend sets _NT_SYMBOL_PATH env var to point at logs/symbols or sthg?

but, we can use hacky solutions for part A, and for part C we can just
disable mismatches: so we can make the lack of symbols non-fatal which
seems good. so instead of fatal error, have warning about false negatives,
and have drmem write a file or sthg which frontend checks on next run (so
user doesn't have to run manual cmd, and frontend doesn't have to check
dlls vs pdbs on every run: wait for drmemorylib to identify problem).
also works for both interactive users and bots.
so it's seeming a frontend option to get syms isn't needed, but could have
it for manual sym download (could run at install time)

once we have the symsrv in place we can use it to get system lib syms which
will help quality of callstacks in general, though IMHO it's still best to
have default suppressions work w/o symbols (just like for above: better w/
syms as in fewer false neg or clearer error reports, but syms not required
to avoid false pos)

OTOH, implementing the issue #607 hack solns requires extra code at runtime: is that
really better than extra time in frontend? though we already have retaddr
and libc bounds are cached, so runtime cost should be much smaller than
frontend opening up files.

@derekbruening
Contributor Author

From bruen...@google.com on August 14, 2012 07:42:28

forgot to include these:

*** TODO create symsrv.yes file to avoid license prompt
else dialog box pops up (and then stores its own symsrv.yes file)
*** TODO add symsrv.dll usage to frontend, or separate tool launched by frontend
*** TODO can later versions of symchk be redistributed? => yes and no
in 6.3, symchk is not on the list. but in 6.12, it is: yet it imports from
DTW's dbgeng.dll, which is NOT on the list! SymbolCheck.dll is on
the list.
*** TODO make sure whatever DTW binaries are distributed work on 2K and XP
so older versions probably better, unless there are features or bug fixes
we need: which seems unlikely
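
A minimal sketch of the first TODO, assuming the marker is an empty file that sits in the same directory as the shipped symsrv.dll (the notes above do not say where it goes). `ensure_symsrv_yes` is a hypothetical helper name.

```python
from pathlib import Path


def ensure_symsrv_yes(symsrv_dir):
    """Create an empty symsrv.yes marker in symsrv_dir if it is missing.

    Returns True if the file was created, False if it already existed.
    """
    marker = Path(symsrv_dir) / "symsrv.yes"
    if marker.exists():
        return False
    marker.touch()
    return True
```

An installer or the frontend could call this once before the first symbol fetch, so no dialog box interrupts a bot run.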

@derekbruening
Contributor Author

From rnk@google.com on August 15, 2012 14:39:22

I experimented a bit with dbghelp and symsrv. I wrote cmake code to copy symsrv into the build dir and package it, but I haven't yet tested that I can ship that package to a clean system and run with it.

The SymSrv* API is a little difficult to use. I think I can figure out the calls that I need to make to fetch symbols, but it might be fragile. I'm thinking that the most robust way to fetch symbols is to call SymSetSearchPath (or set _NT_SYMBOL_PATH in the environment) and then use SymLoadModule64 to fetch the symbols. This approach is very simple, but it's relying on a non-obvious side effect.

How should we communicate the list of modules without pdbs to the frontend? From comment 9, it sounds like we're leaning towards fetching all modules that didn't have pdbs at the end of execution by default. I propose that we store these paths in a separate file in the same directory. The frontend can compute the path for this file from the results.txt path that it currently retrieves. It will then iterate over this file, which should be short, and fetch syms for each module.
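
A rough sketch of the frontend side of that proposal, assuming one module path per line. The file name `missing_symbols.txt` and the `fetch` callback are placeholders, since the actual fetch would go through dbghelp/symsrv.

```python
from pathlib import Path

# Hypothetical file name; it lives next to results.txt in the log directory.
MISSING_SYMS_NAME = "missing_symbols.txt"


def missing_syms_path(results_path):
    """Derive the module-list path from the results.txt path."""
    return Path(results_path).with_name(MISSING_SYMS_NAME)


def read_missing_modules(results_path):
    """Return the unique module paths listed by the client, in file order."""
    path = missing_syms_path(results_path)
    if not path.exists():
        return []
    seen = set()
    modules = []
    for line in path.read_text().splitlines():
        mod = line.strip()
        key = mod.lower()  # Windows paths compare case-insensitively
        if mod and key not in seen:
            seen.add(key)
            modules.append(mod)
    return modules


def fetch_all(results_path, fetch):
    """Call fetch(module_path) for each listed module; map path -> result."""
    return {mod: fetch(mod) for mod in read_missing_modules(results_path)}
```

A missing file is treated as "nothing to fetch", so the frontend pays almost no cost on runs where the client found all the pdbs it needed.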

Owner: rnk@google.com

@derekbruening
Contributor Author

From bruen...@google.com on August 16, 2012 11:36:54

> I'm thinking that the most robust way to fetch symbols is to call SymSetSearchPath (or set _NT_SYMBOL_PATH in the environment) and then use SymLoadModule64 to fetch the symbols.

this is what I was trying before when I hit the private loader issues, so I've never directly looked at the symsrv API's

interestingly, while the 6.3 symchk uses symsrv.dll, the 6.12 version does NOT

the rest of the plan sounds good. we should pop up the notepad w/ results.txt before the symbol download. for those on a console perhaps a message about loading so they know what it's doing? could potentially fork off a process to do it while giving them their prompt back.
