Resource leak in socket and ssl transports #218

GoogleCodeExporter · 2015-04-10T16:46:41Z


waiter future handling in socket and ssl transport is broken. in case of 
cancelled waiter both transports continue to work, but cancelled waiter means 
caller is not interested in connection anymore. for example wait_for() with 
create_connection may leak sockets, if ssl handshake takes more time than 
timeout. proper behavior for cancelled waiter should be removing all read/write 
operation and close socket.

Original issue reported on code.google.com by fafhr...@gmail.com on 11 Jan 2015 at 5:49

The text was updated successfully, but these errors were encountered:

GoogleCodeExporter · 2015-04-10T16:46:41Z

related to #58

Original comment by fafhr...@gmail.com on 11 Jan 2015 at 5:50

GoogleCodeExporter · 2015-04-10T16:46:42Z

Do you have examples to reproduce the issue?

Original comment by victor.s...@gmail.com on 14 Jan 2015 at 12:52

GoogleCodeExporter · 2015-04-10T16:46:42Z

i don't have example, but i have this issue in production system

Original comment by niko...@getkeepsafe.com on 14 Jan 2015 at 2:07

GoogleCodeExporter · 2015-04-10T16:46:42Z

I recently fixed a bug related to this issue:
"kill the subprocess if the creation failed"
http://bugs.python.org/issue23173
change: 
https://code.google.com/p/tulip/source/detail?r=31b0c15d8a4d7213a514f25d848d5f00
336083b4

The bug is similar because on error during subprocess creation, the caller 
doesn't have access to the subprocess and so it cannot kill it. It is now done 
automatically.

> i don't have example, but i have this issue in production system

Ok, I found a way to find such bug: I wrote a patch to "emit ResourceWarning 
warnings if transports/event loops are not explicitly closed"
http://bugs.python.org/issue23243

Using this patch, I already found one obvious bug in create_connection():
https://code.google.com/p/tulip/source/detail?r=3c53308c720a4043eb165d67afe8d182
b1f1e1ec
"Fix BaseEventLoop._create_connection_transport(). Close the transport if the 
creation of the transport (if the waiter) gets an exception."

I also opened another issue related to this one:
http://bugs.python.org/issue23242
"Process must close the transport when the process exit"

Original comment by victor.s...@gmail.com on 15 Jan 2015 at 8:25

GoogleCodeExporter · 2015-04-10T16:46:42Z

I made a change in StreamWriter to clear its reference to the transport when 
the transport is closed. But I had to revert the change because it changed the 
behaviour of Process.stdin.write(): it raised RuntimeError instead of 
BrokenPipeError or ConnectionResetError when stdin is closed.

I don't know if it's possible to clear the reference without changing the 
behaviour. Otherwise, the issue must be documented in a comment in 
StreamWriter.close().

Original comment by victor.s...@gmail.com on 15 Jan 2015 at 10:26

GoogleCodeExporter · 2015-04-10T16:46:42Z

The following Tulip change (not merged in Python yet) should also fix even more 
resource leaks on cancellation:

changeset:   1470:10b8730112c6
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Fri Jan 23 00:52:06 2015 +0100
files:       asyncio/base_events.py
description:
Close transports on error

Fix create_datagram_endpoint(), connect_read_pipe() and connect_write_pipe():
close the transport if the task is cancelled or on error.

Original comment by victor.s...@gmail.com on 23 Jan 2015 at 1:29

GoogleCodeExporter · 2015-04-10T16:46:42Z

@fafhrd91, @nikolay: Hi, I fixed many issues related to this bug report. Could 
you please try the development version of asyncio to check if your resource 
leaks are gone?

Without any script to reproduce the issue, this bug report is not very useful.

The ResourceWarning patch may help you to identify bugs in your code or bugs in 
asyncio.
http://bugs.python.org/issue23243

Original comment by victor.s...@gmail.com on 27 Jan 2015 at 12:34

GoogleCodeExporter · 2015-04-10T16:46:43Z

I just closed http://bugs.python.org/issue23243

There is a remaining issue related to subprocesses: 
http://bugs.python.org/issue23347

I close this issue because there is no scenario to reproduce the bug ("i have 
this issue in production system" is not a valid scenario).

Please open a thread on the Tulip mailing list if you want help on the topic. 
Please also test the latest development version of asyncio: I fixed a lot of 
resource leaks and added destructors to emit warnings when event loops or 
transports are not explicitly closed. Destructors are currently only enabled on 
Python 3.4 and newer to not create objects which cannot be destroyed by the 
garbage collector (see the PEP 442).

Thanks for the feedback! It helped me to identify and fix a lot of bugs.

Original comment by victor.s...@gmail.com on 29 Jan 2015 at 4:59

Changed state: Fixed

GoogleCodeExporter added auto-migrated Priority-Medium Type-Defect labels Apr 10, 2015

GoogleCodeExporter closed this as completed Apr 10, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Resource leak in socket and ssl transports #218

Resource leak in socket and ssl transports #218

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

Resource leak in socket and ssl transports #218

Resource leak in socket and ssl transports #218

Comments

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015

GoogleCodeExporter commented Apr 10, 2015