New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: Default{Client,Transport} need better timeout support when used with connection pooling/redirects #3362
Labels
Comments
heres one way to do it (although admittedly it is not zestfully *clean*) https://github.com/mreiferson/go-httpclient |
Is there any activity on this issue? I'd like to volunteer my time if it's needed. I would consider this a major stability bug for long running services that make http requests. A solid http client with timeout support should be considered a critical part of a "batteries included" standard library. |
There are at least three timeouts that make sense. We may even want all three: // HeaderTimeout is the amount of time to wait to get a response header // after writing the request. It does not include the response body. // Zero means no timeout. HeaderTimeout time.Duration // ResponseTimeout is like HeaderTimeout, but includes the time to transmit // the entire body, regardless of size. If the server is still sending // the body as ResponseTimeout is hit, the body is interrupted and a Read // will result in ErrTimeout. ResponseTimeout time.Duration // InactivityTimeout is the amount of time to get any data from the server // before considering it timed out. Unlike HeaderTimeout and ResponseTimeout, // this timer resets every time a byte of data is received from the server, // so a slow trickle of data could take a large amount of time. // Zero means no timeout. InactivityTimeout time.Duration |
The common use case is probably centered around what bradfitz has labeled as the ResponseTimeout. HeaderTimeout and InactivityTimeout seem to be relatively uncommon, and would require that a particular request has claimed a connection as its own out of a pool until the corresponding timeout has passed or the request has completed. Its not unrealistic to expect applications that rely on this level of detail to implement their own RoundTripper, as connection pooling may not be as critical for APIs that have disparate requirements around the timing of header responses and byte level responses. The way I envisioned this was to simply have the timeout based on the total running time of a call to Tranport.RoundTrip, including any redirects. This is relatively straightforward and should meet the needs of most applications. |
HeaderTimeout and InactivityTimeout are bounded by the lifetime of a round trip. There's no need for a request to make a claim on the connection beyond the claim made by the round trip. Streaming APIs like Twitter's are a use case for HeaderTimeout and InactivityTimeout. With these APIs, the client receives a stream of data and keep-alive messages in the response. The header timeout detects problems connecting to the server. The inactivity timeout detects the case where the client is not receiving data or keep-alive messages. The read deadline can be set via the Transport Dial function, but not in ways that the application might want. For example, it's not possible to implement ResponseTimeout when a connection is reused. |
I actually don't like the name ResponseTimeout anymore, because we name the result parameter from Do/Get/Post (*Response, error), and I don't want people assuming that ResponseTimeout means that Response. BodyTimeout doesn't work well as an alternative to the name ResponseTimeout, because it implies that it's only for a body, not the headers + an optional body. So I'm leaning towards shortening "ResponseTimeout" to just Timeout. So we'll have: 1) Timeout (the whole thing) 2) HeaderTimeout (just the headers) 3) IdleTimeout (what I called InactivityTImeout above) That should be sufficient. If we really need more control later, we could provide a method on *Response (assuming you get a *Response after the HeaderTimeout) to plumb through a deadline to the underlying connection, but I don't think it will be necessary and I don't want to do it now. I recognize that 1) and 3) would be unnecessary if we had 2) and a *Response SetDeadline method, but I'd prefer to provide solutiosn to the common use cases, rather than provide a bunch of mechanisms. Any final thoughts or objections before I start writing this? |
More questions: does the HeaderTimeout begin once the request has been written (including any POST body), or at the beginning of the RoundTrip? Both seem wrong (or at least potentially surprising) for different reasons. Let's assume you don't set the net.Conn's ReadDeadline until after the request has been sent. In most cases, this will be fine, but if the remote server disappears off the network and you write a large HTTP request body such that Write blocks, the RoundTrip will never timeout if it never gets to http.ReadRequest -> bufio.Read -> net.Conn.Read. But if you include the time to send a request body, then some POST requests are now too large to be sent in time. And the name "HeaderTimeout" is now confusing. So back to assuming HeaderTimeout is as documented above, and only for reading the response headers: really the user would need to set both IdleTimeout (for the request write phrase, say 5 seconds to get TCP progress) + HeaderTimeout (say, 30 seconds). That seems annoying to document. Do we want to make a zero IdleTimeout mean the minimum of HeaderTimeout and Timeout? So if the user only specifies a 30 second HeaderTimeout, that also means that at least the server needs to respond to TCP during the request write phase at least once every 30 seconds. That would prevent offline servers from hanging a Go program. None of this is hard to implement. The only hard part of this bug is naming and documentation. Owner changed to @bradfitz. |
Perhaps some of the complexity here is because of the attempt to simplify to fewer variables than there are phases in an HTTP request. It seems like we're asking this proposal to satisfy two very different scenarios. 1) typical HTTP requests between services (regardless of size of request/response body) where you want to control expected performance of both sides... sending (connect + write)/response (headers + body) 2) streaming requests where connection/request write/response header/idle timeouts want to be bounded but there should be no overall timeout to the request. It seems like one answer might be to cover each of the phases with a timeout and allow the sum of their parts to compose the behavior one would expect (each being optional): // set on any read/write to guarantee TCP progress IdleTimeout time.Duration // includes connection time (may or may not be cached) + writing request (headers + body) SendTimeout time.Duration // includes time spent waiting for response headers after sending the request HeaderTimeout time.Duration // includes time spent waiting for the response body after reading response headers BodyTimeout time.Duration However, the counter argument is that this is too much control and all you really want is to be able to bound the overall request time (at all) regardless of which part is taking the bulk of the time. Although, for production debugging I could imagine there being a huge benefit to having the granularity above to understand/control what phase isn't performing. This could be resolved independently if the library provided means to introspect the recorded time spent in each phase of the request. I tend to side with others comments above that this change should perhaps *not* satisfy the needs of streaming/long lived requests (being less common and having very specific requirements). I feel like those may be better implemented with a custom RoundTripper, etc. IMO, you could drop IdleTimeout from your proposal entirely without making much difference. In that case, my vote would be for either of the following: SendTimeout, HeaderTimeout, and BodyTimeout (as described above) or ConnectTimeout, Timeout (which includes all phases of the request) For the record, I realize the proposed suggestion hasn't been including connect time. I think that the goal of this change is to make the standard library httpclient easy and obvious to use (for this use case) and not including connect time would result in unexpected behavior. In particular, supporting request timeouts that don't include connect time (and thus still require a custom Dial function to get DialTimeout) would feel a bit odd and inconsistent. Perhaps documentation could resolve this, though. |
There seems to be a serious challenge to associating timeouts to different phases of the request when there aren't exposed hooks to the associated phases. What if the semantic issue is avoided by bunching the concerns of the call to RoundTrip into a single timeout? // RoundTripTimeout is the amount of time that RoundTripper.RoundTrip will // block before returning a timeout error. This includes, but is not limited // to, the time that it may take to connect, send the request, any request // body and receive any associated headers. The phase in which the request // times out is returned as part of the timeout error. RoundTripTimeout time.Duration // ReadTimeout is the amount of time that calls to Read on Response.Body will // block before returning a timeout error. ReadTimeout time.Duration The main advantages to this are that the abstraction is maintained, issues with down or overloaded servers are addressed within the bounds of RoundTripTimeout, and there is a reasonable hook to ensure that Body.Read doesn't block goroutines forever. In addition, timeouts are defined by the boundaries of the API function calls, rather than by the boundaries of various phases of the request, which is much easier to test, verify and reason about. At the same time, the actual causes of the timeouts can be inspected via the returned errors of each associated function call, be it RoundTripper.RoundTrip or Response.Body.Read. |
I've sent https://golang.org/cl/7369055 to add ResponseHeaderTimeout. It's purely the time after writing the full request (including optional body), until we get response headers. I'm focusing first on the primitives you can't do otherwise. Dial timeouts can already be done with the Transport.Dial hook. Timeouts reading the response.Body can be done purely by Go code, especially once Issue 3672 is fixed to just kill the TCP connection. The one that remains then is timeouts writing the request Body, where we might want something like IdleTimeout. I'm also considering whether we might want: func (t *Transport) StopRequest(r *Request) ... so people implementing timeouts in their own programs have a reliable mechanism to shut down a request's TCP connection, since the Transport knows which request went to which connection. If we had that, maybe we wouldn't need the request Body timeout. |
Can we instead wait on the fact that there is a request sent on that conn instead of sitting in a Peek()? This would allow the implementation to be simplified, standard library side, by setting the ReadDeadline on the net.Conn before waiting on a Read (and wouldn't incur a goroutine spawn per request which is expensive at high volume). Was sitting in Peek() used to attempt detection of connections that are closed on you? Isn't that sorta racey? |
This issue was updated by revision 839d47a. R=golang-dev, adg, rsc CC=golang-dev https://golang.org/cl/7369055 |
Re #24: we're only sitting in a Peek when the connection is idle, or when we're anticipating a response. We need to be actively reading from the connection to know when a peer closes its keep-alive connection on us so we don't re-use it. It works most of the time but yes, it's inherently racy, but that's the protocol and reality's fault, no Go's. See issue #4677. |
CL out for review to add Transport.CancelRequest: https://golang.org/cl/7372054/ |
This issue was updated by revision 11776a3. Update issue #3672 R=golang-dev, rsc, adg CC=golang-dev https://golang.org/cl/7372054 |
Now that we have revision 02ee8f7b9a17, revision bcecef1cb2fd, and revision d4e1ec84876c in, I'm tempted to change this bug to not be Go1.1 anymore. I think we now have all the necessary low-level mechanisms to let Go programs implement their own policies, without adding dozens of new knobs. If anybody wants to argue for things we should have for Go 1.1 (either low-level mechanisms or higher-level knobs), speak now. The cut-off date is approaching. |
I'd like to make a case for keeping this, since it seems that it may get lost. The primary issue I have with the CancelRequest method, is that if it is used for a timeout, the caller gets a "use of closed network connection" error. I strongly feel that a timeout situation should return a net.Error with TImeout() == true, which would be the case when any of the other deadlines are reached. Another more minor detail is that this requires one to setup a timer of some sort for each request. This would idiomatically be in another goroutine with a time.Timer, which adds control channels and goroutine overhead for each call. |
With respect to dial timeouts and the `Transport.Dial` hook, there's an edgecase not covered when dealing with HTTPS requests. Using a custom dial method to invoke `DialWithTimeout` works for a connect timeout, but none of the TLS handshaking has happened when that returns. Currently none of the timeouts address a TLS handshake timing out. If a connection is to be used for exactly one request, a deadline can be set, but with re-use of working connections, I don't think there's a way to update the deadline before every HTTP request; unless I'm mistaken, the underlying `net.Conn` structures are never exposed outside of the http package other than through the custom Dial function. |
Issue raised on golang-dev: http://goo.gl/Wqhhqy |
This issue was updated by revision 49beb23. Also set a 30 second timeout, instead of relying on the operating system's timeout, which if often but not always 3 minutes. LGTM=crawshaw R=rsc, crawshaw CC=golang-codereviews https://golang.org/cl/68330046 |
This issue was updated by revision fd4b4b5. LGTM=agl R=agl CC=golang-codereviews https://golang.org/cl/68150045 |
I've made two changes for Go 1.3 for this issue: 1) TCP Keep-Alives are now used by default, so at least dead connections will eventually give a read error and go away. 2) There is now a TLSHandshakeTimeout and is set by default. I picked 10 seconds for DefaulTransport pretty arbitrarily and conservatively. The issue has gotten to the point where it's almost too large to read anymore and it's mostly musing and not proposals. If somebody could summarize here what's still missing that's not doable outside of net/http, that would be helpful. I believe timeouts spanning requests involving redirects are doable with the Client.CheckRedirect hook to make a map from original request => subsequent requests, and then using Transport.CancelRequest. To add to the musing, we could add: type Client struct { // .... // Timeout, if non-zero, gives the total time allowed for all phases of a request // including any of its subsequent redirect requests. This includes the connection // time, writing the request(s), and reading the entire response body(ies). // If the underlying RoundTripper implementation of the Client's Transport // does not support CancelRequest, this Timeout is ignored. Timeout time.Duration } But like the comment says, this would be based on CancelRequest and tracking redirects to cancel the right request when the timeout occurs, so it could be confusing to people if used with a RoundTripper that doesn't support CancelRequest. Maybe that's an error, though, and we fail fast. Do we then need a public interface for RequestCanceler, or can it be unexported? This support would also wrap the return Response Body's io.ReadCloser to one that note the EOF/Close and stops the timer. I don't see the need to rush that in, unless everybody agrees that it's needed, especially since it can be implemented by hand anyway. Unless there are specific suggestions this week, we're probably done for Go 1.3. Please speak up soon if you want more. |
Thanks for the recent improvements, Brad. It would be superb if we could have such a straightforward Timeout parameter as you suggest, as I've been observing broken usage of the http package every other day. The fact we can say "you can tune it yourself to have sane behavior" feels like a departure from what we preach elsewhere, and people find they're missing details in the worst possible way. Unprompted conversation from my IRC log, *today*, as evidence: Feb 25 11:22:40 niemeyer: (...) this is the branch I was talking about, setting up http Client is around line 232... Feb 25 11:25:43 pedronis: The code in there is even more problematic Feb 25 11:25:58 pedronis: As it's timing out only if the connection fails altogether Feb 25 11:26:15 niemeyer: well it fails if the header are not sent in time Feb 25 11:26:24 pedronis: Not really Feb 25 11:26:43 niemeyer: you are telling me that the client code is completely broken? Feb 25 11:26:50 pedronis: Yes, kind of.. sorry Feb 25 11:27:19 pedronis: Please bitch here: http://golang.org/issue/3362 |
yes, Thank you Brad. I'm still going to push for the overall Timeout setting. The timeout method used by github.com/mreiferson/go-httpclient is now pretty slick, but it definitely not obvious for most how that would been implemented (took a couple iterations in that library itself). We could better promote helper libs like this, but people are always going to miss out on them. I also still would like to see a actual Timeout error returned instead of the closed connection error, but I think the proper reference implementation is still more important. |
I'm also in favor of an end to end timeout. In an effort to be clear about some edge cases, e.g. redirection, what's most important in my applications is when using `keep-alive` and calling `response, err := httpClient.Get(...)` I can express a guarantee that the calling code will eventually have control returned to it. Meaning connection creation, TLS handshaking, redirects, POST data to send (obviously not for a `Get`) and receiving the body from the server needs to be covered by a single, or some combination of timeouts. |
Sent https://golang.org/cl/70120045 Please review. |
This issue was closed by revision 2ad72ec. Status changed to Fixed. |
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
The text was updated successfully, but these errors were encountered: