New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: amd64 vdso parser issues #8197
Labels
Milestone
Comments
Candidate workaround. This applies to this Linux revision: commit 4d048b0255e3dd4fb001c5f1f609fb67463d04d6 Author: H. Peter Anvin <hpa@zytor.com> Date: Tue Jun 10 14:25:26 2014 -0700 x86, vdso: Remove one final use of htole16() I'm not sure if it's in Linus' tree yet, but it's in the x86/vdso branch of tip.git. This will regress performance for all Go programs that use the broken vDSO parser. How big of a deal is this? |
...now with attachment. Attachments: |
Also, see this: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=vdso/cleanups&id=87f62cc3db04047f9edbf7bacac059396d7112f1 and its parent for an updated reference parser. These changes make it work for 32-bit userspace. |
CL https://golang.org/cl/101260044 mentions this issue. |
It's just a C compiler. It doesn't zero local variables. However, the function is called very early on, in a program that starts with a zeroed stack. So the content is not random garbage. It is the data left over from runtime.check's variables, which are constants. That might help things a little or at least explain why this hasn't been flaky. |
If you have a better idea for how the kernel could provide this service more simply, I'm all ears. Note that it's a huge speedup: it's something like 21ns vs 100ns. CLOCK_MONOTONIC_COARSE is even more dramatic. And on 32-bit systems, you need the same code to avoid using int $0x80 for all syscalls, which is remarkably slow. We used to provide a service by which you could ask for the time by calling 0xffffffffff600000, but that was a really juicy target for exploits. Note that Go is a bit unusual here: basically everything else already has a perfectly function ELF loader that can handle this. And I did provide a nice CC0-licensed piece of reference code that worked fine, modulo Go-specific stack size issues. |
Basically you want the kernel to provide a mapping for a small number of magic symbols to addresses that can be called at runtime. In other words, you want to map a small number of indexes to addresses. I can think of many different ways to handle that in the kernel. I don't think the first mechanism I would reach for would be for the kernel to create an in-memory shared library. It's kind of a baroque mechanism for implementing a simple table. It's true that dynamically linked programs can use the ELF loader. But the ELF loader needed special changes to support VDSOs. And so did gdb. And this approach doesn't help statically linked programs much. And glibc functions needed to be changed anyhow to be aware of the VDSO symbols. So as far as I can tell, all of this complexity really didn't get anything for free. It just wound up being complex. All just my opinion, of course. |
Fair enough. If I'd done this from scratch, I might not have used an ELF file. OTOH, debuggers probably do want a real ELF image: they can look at the DWARF call-frame information, and, if they're really fancy, they can even find the debug symbols that we helpfully install (sometimes) in /lib/modules/KERNELVER/vdso using the ELF build-id mechanism. So it's not that bad. The actual cause of the 3.16 issue is that I'm trying to clean this stuff up. It was a giant mess two years ago, and it's gradually getting better -- at least we're converging on one way of doing this instead of the giant mess that used to exist. |
*not an actual proposal, just thoughts* I guess these parser issues will soon become the source of mysterious crashes of Go servers running on the newer kernels and will potentially affect many people. Would it be possible for the kernel to detect the bogus programs before they are executed, and don't provide VDSO for them? In this case, they will run slower, but at least they won't crash. If implemented, it would make sense to keep this compatibility hack for a couple of kernel releases (say, until 3.18), until most of the actively-being-worked-on Go programs are recompiled, and then just drop the hack. * again, that's not a well-thought proposal and I don't even know, if such behavior is appropriate for the kernel even on a temporary basis. Also, I don't see an easy marker to detect the binaries affected by the bug * |
I've send this patch: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=vdso/cleanups the idea is that the kernel will provide a fake section table that contains exactly this: const __visible struct elf64_shdr vdso_fake_sections[] = { { .sh_type = SHT_DYNSYM, .sh_entsize = sizeof(Elf64_Sym), } }; Old Go programs will see it, initialize num_sym to zero, and not find any symbols. The general kernel policy is to keep these things around for a very long time, if not forever. Sometimes they get hidden behind a configuration option if they become a maintenance burden (e.g. CONFIG_COMPAT_VDSO, which used to be a giant hack to support a particular broken glibc minor version, and now just disables the vDSO if selected). |
Shameless plug: if you want a really easy way to do this, give virtme ([1] or [2]) a try: [1] https://git.kernel.org/cgit/utils/kernel/virtme/virtme.git/ [2] https://github.com/amluto/virtme I've been using virtme-run --kimg arch/x86/boot/bzImage --console to test this stuff. Would it be possible to remove the vsyscall fallback part? Admittedly, getting everything (i.e. Go, glibc, etc) to get rid of the vsyscall fallbacks may be a lost cause, but these days falling back to ordinary syscalls will be much faster than falling back to the vsyscall page. Also, I'm not 100% convinced by your symbol version checking code. I probably need to remind myself exactly how this is supposed to work. The version checking code in my parser is much more complicated, and I assume there's a reason I did that. |
I don't really want to have this conversation again in 2 years. Can we just import the Linux code please? Like https://golang.org/cl/105950043 (untested). |
The static vdso_info variable makes me sad. Anyhow, this is standard ELF dynamic loading code. It doesn't have to be magic. Where we went wrong last time was using the standard symbol table rather than sticking to dynamic linker info. That was probably my fault, and I'm certainly sorry now. |
I tweaked the version code so that an unknown version does not match. I filed issue #8200 to stop using vsyscalls. |
This issue was closed by revision 1db4c8d. Status changed to Fixed. |
Hi Andy, I am now have the system setup for testing, but I realized that the branch you referred to is now deleted: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=vdso/cleanups What's the new one to try? |
Nevermind. I found torvalds/linux@c728762 Will test. |
For the record: 3.16 with the vdso kernel hack works fine with both old and new Go binaries. root@(none):/home/krasin# ./all.sh Running on the system: Linux (none) 3.16.0-rc1 #3 SMP Mon Jun 16 15:08:12 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux Before fix Hello, it's 2014-06-17 00:04:18.507931 -0700 PDT Go runtime: go1.2.2 After fix Hello, it's 2014-06-17 00:04:18.684539441 -0700 PDT Go runtime: devel +2dc2bf98e6e3 Mon Jun 16 16:32:47 2014 -0700 Also, it works with no issues on older kernels: ./all.sh Running on the system: Linux vdso 3.13.11.2-custom #3 SMP Sat Jun 14 12:36:54 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux Before fix Hello, it's 2014-06-17 00:07:11.235963278 -0700 PDT Go runtime: go1.2.2 After fix Hello, it's 2014-06-17 00:07:11.240788594 -0700 PDT Go runtime: devel +2dc2bf98e6e3 Mon Jun 16 16:32:47 2014 -0700 Thanks, Andy, Ian, Russ. |
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jun 25, 2018
Reportedly in the Linux 3.16 kernel the VDSO will not have section headers or a normal symbol table. Too late for 1.3 but perhaps for 1.3.1, if there is one. Fixes golang#8197. LGTM=rsc R=golang-codereviews, mattn.jp, rsc CC=golang-codereviews https://golang.org/cl/101260044
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
by amluto:
The text was updated successfully, but these errors were encountered: