runtime: unable to use SIGSTOP to background #4391

Closed
gopherbot opened this issue Nov 15, 2012 · 20 comments

@gopherbot

If I try to SIGSTOP the current process, that seems to only block one thread, while
other threads continue. I also tried signaling the session leader, same results.

This reproduces the problem:

$ cat sigstop.go
package main

import (
    "syscall"
    "fmt"
)

func main() {
    fmt.Println("one")
    pid := syscall.Getpid()
    err := syscall.Kill(pid, syscall.SIGSTOP)
    if err != nil {
        panic("bleh")
    }
    fmt.Println("two")
}

$ go build sigstop.go
$ ./sigstop
one
two

What is the expected output?

Compare to

$ cat sigstop.c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>

int main(void) {
  pid_t pid;

  printf("one\n");
  pid = getpid();
  kill(pid, SIGSTOP);
  printf("two\n");
}

$ gcc sigstop.c
$ ./a.out 
one

[1]+  Stopped                 ./a.out
$ fg
./a.out
two
$ 

Which compiler are you using (5g, 6g, 8g, gccgo)?

6g via "go build"

Which operating system are you using?

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.1 LTS
Release:    12.04
Codename:   precise

Which version are you using?  (run 'go version')

$ go version
go version go1
@gopherbot
Author

Comment 1:

This makes the backgrounding mostly work:

package main

import (
    "fmt"
    "syscall"
)

func main() {
    fmt.Println("one")
    pid := syscall.Getpid()
    tid := syscall.Gettid()
    err := syscall.Tgkill(pid, tid, syscall.SIGSTOP)
    if err != nil {
        panic("bleh")
    }
    fmt.Println("two")
}

except now any other threads that happen to be trying to read /dev/tty are busy looping
getting endless SIGTTIN. Although that may be an artifact of me using strace to look at
them.

@gopherbot
Author

Comment 2:

Alright so I said "mostly". That's because in a real app (my fork of godit, with suspend
support https://github.com/tv42/godit/tree/suspend -- depends on my fork of termbox-go),
if I look at the process with `strace -ff -p "$(pidof godit)"`, after I hit control-Z,
the rest of the threads seem to start busy looping.
I don't currently know if this is triggered purely by strace; the CPU usage doesn't
spike significantly.
Also, this might just mean the app needs to catch SIGTTIN -- but what should it do?
[pid 23688] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23688] restart_syscall(<... resuming interrupted call ...> <unfinished
...>
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23690] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23691] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23690] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] futex(0xf84005fc28, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23690] futex(0x586678, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23689] read(4,  <unfinished ...>
[pid 23688] <... restart_syscall resumed> ) = ? ERESTART_RESTARTBLOCK (To be
restarted)
[pid 23689] <... read resumed> 0xf8400cf000, 128) = ? ERESTARTSYS (To be restarted)
[pid 23688] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23688] restart_syscall(<... resuming interrupted call ...> <unfinished
...>
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23690] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23691] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23690] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] futex(0xf84005fc28, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23690] futex(0x586678, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23689] read(4,  <unfinished ...>
[pid 23688] <... restart_syscall resumed> ) = ? ERESTART_RESTARTBLOCK (To be
restarted)
[pid 23689] <... read resumed> 0xf8400cf000, 128) = ? ERESTARTSYS (To be restarted)
[pid 23688] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23688] restart_syscall(<... resuming interrupted call ...> <unfinished
...>
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23690] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23691] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23690] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] futex(0xf84005fc28, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23690] futex(0x586678, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23689] read(4,  <unfinished ...>
[pid 23688] <... restart_syscall resumed> ) = ? ERESTART_RESTARTBLOCK (To be
restarted)
[pid 23689] <... read resumed> 0xf8400cf000, 128) = ? ERESTARTSYS (To be restarted)
[pid 23688] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23688] restart_syscall(<... resuming interrupted call ...> <unfinished
...>
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23690] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23691] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23690] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] futex(0xf84005fc28, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23690] futex(0x586678, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 23689] read(4,  <unfinished ...>
[pid 23688] <... restart_syscall resumed> ) = ? ERESTART_RESTARTBLOCK (To be
restarted)
[pid 23689] <... read resumed> 0xf8400cf000, 128) = ? ERESTARTSYS (To be restarted)
[pid 23688] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23688] restart_syscall(<... resuming interrupted call ...> <unfinished
...>
[pid 23689] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23690] <... futex resumed> )       = ? ERESTARTSYS (To be restarted)
[pid 23691] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23690] --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
[pid 23691] futex(0xf84005fc28, FUTEX_WAIT, 0, NULL <unfinished ...>

@nsf

nsf commented Nov 16, 2012

Comment 3:

SIGTTIN is our fault, actually, I think, because we should not make read attempts while
the process is stopped. That's what SIGTTIN means: it is a /dev/tty read attempt from a
process that was stopped by SIGSTOP.
There are a couple of other problems as well. For example, Go sets all the signal
handlers with SA_RESTART, but frankly it would be nice to have the ability to interrupt a
syscall with a signal. SIGTTIN is one of the cases for that.
Anyway, I think we can work around all that if we try really hard, without filing a lot
of bug reports on Go. :)

@nsf

nsf commented Nov 16, 2012

Comment 4:

Tommi, actually I've made it work, check it out: `go get -u github.com/nsf/godit`.

@rsc
Contributor

rsc commented Dec 10, 2012

Comment 6:

Labels changed: added size-l.

@gopherbot
Author

Comment 7:

Sorry for lack of updates. The SIGTTIN strace in the above is caused by the app in
question having goroutines trying to still read from the tty. Every read attempt gets a
SIGTTIN, and it keeps spinning there. While the app *could* be careful and stop the
reading goroutines (which isn't trivial in the first place; they're blocking on .Read()
currently), very few apps will have custom code to handle SIGTSTP in the first place, so
I expect C-z'ing miscellaneous Go applications will have messy results. A goroutine
blocking on os.Stdin.Read() oughta behave better, with no extra logic.
Also, I'm not convinced the syscall.Gettid and .Tgkill hack is guaranteed to work in
general; I don't think I saw anything in the ref saying goroutines can't reschedule
whenever they want. The OS thread id might change, unless we do runtime.LockOSThread(),
and that's really only useful for new goroutines (because we don't want to leave things
locked).
Not sure what would be a clean way out of this mess.

@rsc
Contributor

rsc commented Dec 30, 2012

Comment 8:

Labels changed: added priority-later, removed priority-triage.

Status changed to Accepted.

@rsc
Contributor

rsc commented Feb 15, 2013

Comment 9:

Yuck. See also issue #4494.

@nsf

nsf commented Feb 15, 2013

Comment 10:

Hey, guys, is this issue still relevant? I mean, the given example was based on godit
(github.com/nsf/godit, an emacs-like text editor written in Go), and it works in godit. You
can do `C-z` and then `fg` back. So, really, it works, doesn't it?

@gopherbot
Author

Comment 11:

To comment #10: godit now has the kludge from comment #1; I'm not convinced the Tgkill
is not racy; see comment #7.
A quick strace of a suspended godit shows threads still spinning with SIGTTIN/SIGTTOU,
so that part is not resolved either.

@nsf

nsf commented Feb 28, 2013

Comment 12:

> A quick strace of a suspended godit shows threads still spinning with SIGTTIN/SIGTTOU,
> so that part is not resolved either.

Nothing like that happens on my machine. I can only see SIGIO (since we're using that
now in termbox) when godit is active (in the foreground); after C-z, strace shows no activity.
Have you updated the termbox-go library? Try the full update:
`go get -u github.com/nsf/godit`. It can't be an OS-specific issue, as C-z works only under Linux.

@rsc
Contributor

rsc commented Mar 12, 2013

Comment 13:

We don't have time to get intraprocess signals correct before Go 1.1. Postponing this,
sorry.

Labels changed: added go1.2, removed go1.1.

@rsc
Contributor

rsc commented Jul 30, 2013

Comment 14:

Labels changed: added feature.

@robpike
Contributor

robpike commented Aug 15, 2013

Comment 15:

Moving to Go1.3. This won't make it into 1.2. Sorry.

Labels changed: added go1.3, removed go1.2.

@robpike
Contributor

robpike commented Aug 20, 2013

Comment 16:

Labels changed: removed go1.3.

@rsc
Contributor

rsc commented Nov 27, 2013

Comment 17:

Labels changed: added go1.3maybe.

@rsc
Contributor

rsc commented Nov 27, 2013

Comment 18:

Labels changed: removed feature.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 19:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 20:

Labels changed: added repo-main.

@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@ianlancetaylor
Contributor

This works as expected in Go 1.5 and on tip.

@golang golang locked and limited conversation to collaborators Dec 29, 2016