
lusca-cache - issue #89

Proposal, SO_REUSEADDR before connect


Posted on Mar 3, 2010 by Grumpy Lion

I have proxies with more than 10-20k outbound connections. At peak times I get errors like:

2010/02/03 21:24:49| commBind: Cannot bind socket FD 3892 family 2 to 0.0.0.0 port 0: (98) Address already in use

According to http://www.ibm.com/developerworks/linux/library/l-sockpit/

and according to the Linux man page: "Linux will only allow port re-use with the SO_REUSEADDR option when this option was set both in the previous program that performed a bind(2) to the port and in the program that wants to re-use the port. This differs from some implementations (e.g., FreeBSD) where only the later program needs to set the SO_REUSEADDR option. Typically this difference is invisible, since, for example, a server program is designed to always set this option."

There is of course also a system-wide solution (sometimes required in addition to the socket option mentioned above):

net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse

Both of them help. It is probably a good idea to implement this in the code, under an #ifdef for Linux?

I guess the place to change is comm_fdopen6(int new_socket,

where commSetReuseAddr(new_socket); would be moved out so that it is always called (by default, I guess, it is only set on listening ports; I hope it is harmless in the case of UDP).
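To make the proposal concrete, here is a minimal sketch in plain C (not Lusca code) of an outbound socket that gets SO_REUSEADDR before bind(). The helper names open_outbound_socket and reuseaddr_is_set are made up for illustration; inside Lusca the equivalent steps would presumably be commSetReuseAddr followed by commBind. Whether this actually relieves the "Address already in use" errors is the open question of this issue, so treat it as something to experiment with rather than a fix.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of Lusca: create a TCP socket, set
 * SO_REUSEADDR, then bind it to local_ip with port 0 so the kernel
 * picks an ephemeral port. Returns the descriptor, or -1 on failure. */
static int open_outbound_socket(const char *local_ip)
{
    struct sockaddr_in local;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* The option has to be set before bind() to influence port selection. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(0);  /* port 0: let the kernel choose */
    if (inet_pton(AF_INET, local_ip, &local.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: 1 if SO_REUSEADDR is set on fd, 0 if not, -1 on error. */
static int reuseaddr_is_set(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A caller would then connect() the returned descriptor as usual and close() it when done.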

Comment #1

Posted on Mar 20, 2010 by Happy Horse

Hm. Wait, so what you're saying is that Linux will benefit from having SO_REUSEADDR set on -all- incoming and outbound connections?

Comment #2

Posted on Mar 20, 2010 by Grumpy Lion

Yes, because on Linux even the socket of an outgoing connection stays in TIME_WAIT for some time, which means the port cannot be reused during that period. Maybe add a knob in the config? As I understand it, enabling this option can have negative security implications in some cases.

I guess it would also be good to mention in the manual or somewhere that anyone who sees this error message should also take a look at:

net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse
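To check how many sockets are actually sitting in TIME_WAIT, one option is to count rows in /proc/net/tcp whose "st" column is 06. The sketch below is an illustration in plain C with made-up function names (proc_tcp_line_state, count_tcp_state); it assumes the usual Linux /proc/net/tcp row layout and only covers IPv4.

```c
#include <stdio.h>

/* Value of the hex "st" column in /proc/net/tcp for TIME_WAIT sockets. */
#define TCP_STATE_TIME_WAIT 0x06

/* Hypothetical helper: parse one /proc/net/tcp row and return its state,
 * or -1 for the header line and anything else that does not parse. */
static int proc_tcp_line_state(const char *line)
{
    unsigned int slot, state;
    char local[64], remote[64];

    /* Row layout: "  sl: local_addr:port rem_addr:port st ..." */
    if (sscanf(line, " %u: %63s %63s %x", &slot, local, remote, &state) != 4)
        return -1;
    return (int)state;
}

/* Hypothetical helper: count the rows of an open /proc/net/tcp-style
 * stream that are in the wanted state. */
static long count_tcp_state(FILE *fp, int wanted)
{
    char line[512];
    long n = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (proc_tcp_line_state(line) == wanted)
            n++;
    }
    return n;
}
```

Typical use would be to fopen("/proc/net/tcp", "r"), pass the stream to count_tcp_state(fp, TCP_STATE_TIME_WAIT), and compare the result with the size of the local port range.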

Comment #3

Posted on Oct 17, 2010 by Swift Hippo

SO_REUSEADDR is generally only useful where something bind()'s to a port for listening, and one needs to rebind quickly after a process restart or crash.

I have no evidence that this is at all useful for outbound connections. In fact, the only way I know to control that behaviour is to use the tw_recycle bits, or tweak the tcp fin/ack/syn timeouts.

Comment #4

Posted on Nov 4, 2010 by Swift Hippo

I can confirm that the net.ipv4.tcp_tw_recycle=1 setting is a general cure for this.

Comment #5

Posted on Nov 10, 2010 by Grumpy Lion

tw_recycle is harmful for NAT. I have had many situations where, with this knob set, NAT clients could not work normally with the proxy (stalled connections, or TCP connections that could not be established; I don't remember exactly, I tried it recently).

My solution seems to have been a different parameter(?): net.ipv4.tcp_orphan_retries was 0 by default, and when I set it to 1 my problem disappeared.

Comment #6

Posted on Nov 10, 2010 by Grumpy Rabbit

tcp_orphan_retries didn't work for me.

/proc/sys/net/ipv4/tcp_tw_reuse does work, although I get a lot of:

TCP: time wait bucket table overflow

Comment #7

Posted on Nov 10, 2010 by Grumpy Rabbit

By the way, I don't have that many connections, mgr:info shows this:

Squid Object Cache: Version LUSCA_HEAD-r14756
Start Time: Wed, 10 Nov 2010 19:41:29 GMT
Current Time: Wed, 10 Nov 2010 21:06:06 GMT
Connection information for Squid:
    Number of clients accessing cache: 4465
    Number of HTTP requests received: 3892521
    Number of ICP messages received: 0
    Number of ICP messages sent: 0
    Number of queued ICP replies: 0
    Request failure ratio: 0.03
    Average HTTP requests per minute since start: 45995.1
    Average ICP messages per minute since start: 0.0
    Select loop called: 11358493 times, 0.447 ms avg
Cache information for Squid:
    Request Hit Ratios: 5min: 0.0%, 60min: 0.0%
    Byte Hit Ratios: 5min: -3.2%, 60min: -2.6%
    Request Memory Hit Ratios: 5min: 0.0%, 60min: 0.0%
    Request Disk Hit Ratios: 5min: 0.0%, 60min: 0.0%
    Storage Swap size: 0 KB
    Storage Mem size: 179180 KB
    Mean Object Size: 0.00 KB
    Requests given to unlinkd: 0

Is there something I can try to fix this?

"commBind: Cannot bind socket FD 17164 family 2 to 0.0.0.0 port 0: (98) Address already in use"

I'm getting these around 232 times per second in cache.log.

Comment #8

Posted on Nov 15, 2010 by Grumpy Lion

Have you tried half_closed_clients off?

Comment #9

Posted on Nov 16, 2010 by Happy Monkey

I can agree with nuclearcat regarding net.ipv4.tcp_tw_recycle=1 being harmful. In my tproxy setup I got exactly that behaviour: stalling, connect failures, etc.

nuclearcat: Are you running a tproxy setup?

Comment #10

Posted on Nov 16, 2010 by Happy Monkey

Er, never mind. If you're getting commBind errors, then likely not.

Comment #11

Posted on Nov 16, 2010 by Happy Monkey

What does mgr:curcounters show?

Comment #12

Posted on Nov 16, 2010 by Happy Monkey

Just furthering research on the topic in the linux kernel:

From: inet_bind() in http://lxr.linux.no/linux+v2.6.36/net/ipv4/af_inet.c#L511

EADDRINUSE will be returned if inet_csk_get_port() from http://lxr.linux.no/linux+v2.6.36/net/ipv4/inet_connection_sock.c#L120 was unable to find a free port to bind() to in the bind bucket.

There are some checks in inet_csk_get_port() to see whether sk->sk_reuse has been set, which is controlled with setsockopt(SO_REUSEADDR), so this request may actually have some value.

Can you try it with a patch against comm.c such as:

      } else if (! sqinet_is_noaddr(&F->local_address)) {
    +     commSetReuseAddr(new_socket);
          if (commBind(new_socket, &F->local_address) != COMM_OK) {
              comm_close(new_socket);
              return -1;
          }
      }
      F->local_port = sqinet_get_port(a);

Comment #13

Posted on Nov 18, 2010 by Swift Hippo

OK, having played with the above patch, it didn't really make a major difference when apachebenching either a transparent, or tproxied lusca.

What did help immensely was the following:

net.ipv4.tcp_max_orphans = 8192
net.ipv4.tcp_orphan_retries = 1
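Since several sysctl knobs have come up in this thread, a quick way to see what a given box is currently running with is to read them from /proc/sys. The following is only a sketch in plain C with made-up helper names (read_sysctl_long, report_knobs); the four paths are the knobs named in the comments above, and a knob that does not exist on the running kernel is simply reported as unavailable.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a single integer from a /proc/sys file.
 * Returns 0 and stores the value in *out, or -1 if the file is missing
 * or does not start with a number. */
static int read_sysctl_long(const char *path, long *out)
{
    char buf[64];
    char *end;
    long v;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    errno = 0;
    v = strtol(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -1;
    *out = v;
    return 0;
}

/* Hypothetical helper: print the knobs discussed in this issue. */
static void report_knobs(void)
{
    static const char *knobs[] = {
        "/proc/sys/net/ipv4/tcp_tw_reuse",
        "/proc/sys/net/ipv4/tcp_tw_recycle",
        "/proc/sys/net/ipv4/tcp_orphan_retries",
        "/proc/sys/net/ipv4/tcp_max_orphans",
    };
    size_t i;

    for (i = 0; i < sizeof(knobs) / sizeof(knobs[0]); i++) {
        long v;
        if (read_sysctl_long(knobs[i], &v) == 0)
            printf("%s = %ld\n", knobs[i], v);
        else
            printf("%s: not available\n", knobs[i]);
    }
}
```

Calling report_knobs() before and after changing a setting makes it easier to tell which change actually coincided with the commBind errors going away.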

Status: New

Labels:
Type-Defect Priority-Medium Version-1.0