Discussion:
generating ECONNRESET with no syscall?
Edgar Fuß
2016-04-20 12:52:20 UTC
Is it, for a user process, legally possible to put one of its sockets into a
state where trying to connect to it returns ECONNRESET without the process
issuing any syscalls during the connection attempt?

We have two production web servers, both running lighttpd, the first still
running on NetBSD 4.0.1, the second on 6.1. Occasionally, on the second, you
get Connection Reset by Peer (NOT Connection Refused) on the HTTP port.
ktrace-ing the lighttpd process reveals close to no activity. Restarting
lighttpd instantly resolves the problem. Is there any way the lighttpd process
may have put its socket into this condition, or must this be a kernel issue?

J. Lewis Muir
2016-04-20 15:00:06 UTC
Post by Edgar Fuß
Is it, for a user process, legally possible to put one of its sockets
into a state where trying to connect to it returns ECONNRESET without
the process issuing any syscalls during the connection attempt?
We have two production web servers, both running lighttpd, the first
still running on NetBSD 4.0.1, the second on 6.1. Occasionally, on the
second, you get Connection Reset by Peer (NOT Connection Refused)
on the HTTP port. ktrace-ing the lighttpd process reveals close to
no activity. Restarting lighttpd instantly resolves the problem. Is
there any way the lighttpd process may have put its socket into this
condition, or must this be a kernel issue?
Hi, Edgar.

What version of lighttpd is it? Is there a firewall on the server or
elsewhere that could be unexpectedly sending the TCP reset? Is CGI or
similar involved, or does it happen even when serving static content?

Regards,

Lewis

Edgar Fuß
2016-04-20 15:38:23 UTC
Post by J. Lewis Muir
What version of lighttpd is it?
1.4.35nb2.
Post by J. Lewis Muir
Is there a firewall on the server
Yes.
Post by J. Lewis Muir
or elsewhere
No.
Post by J. Lewis Muir
that could be unexpectedly sending the TCP reset?
Nothing in the Log. Also, why should restarting lighttpd resolve that?
Post by J. Lewis Muir
Is CGI or similar involved, or does it happen even when serving static
content?
???
You get a Connection reset by peer when telnet'ing to the server's HTTP port.
When you ktrace lighttpd at that time, you don't see any activity on
the socket (or rather, almost no activity at all).

J. Lewis Muir
2016-04-20 17:44:38 UTC
Post by Edgar Fuß
Post by J. Lewis Muir
What version of lighttpd is it?
1.4.35nb2.
Post by J. Lewis Muir
Is there a firewall on the server
Yes.
Post by J. Lewis Muir
or elsewhere
No.
Post by J. Lewis Muir
that could be unexpectedly sending the TCP reset?
Nothing in the Log. Also, why should restarting lighttpd resolve that?
OK, maybe it shouldn't. I was thinking this happened during a session,
but, rereading your initial post, I understand it happens when the
client *tries* to connect.
Post by Edgar Fuß
Post by J. Lewis Muir
Is CGI or similar involved, or does it happen even when serving
static content?
???
You get a Connection reset by peer when telnet'ing to the server's HTTP
port. When you ktrace lighttpd at that time, you don't see any
activity on the socket (or rather, almost no activity at all).
OK, I see; I was mistakenly thinking it happened during a session, but I
understand now that it is happening when the client *tries* to connect.

Lewis

Robert Elz
2016-04-20 19:14:27 UTC
Date: Wed, 20 Apr 2016 17:38:23 +0200
From: Edgar Fuß <***@math.uni-bonn.de>
Message-ID: <***@trav.math.uni-bonn.de>

| You get a Connection reset by peer when telnet'ing to the server's HTTP port.
| When you ktrace lighttpd at that time, you don't see any activity on
| the socket (or rather, almost no activity at all).

Sounds like connection (listen) queue full to me. For whatever reason
the server isn't accepting pending connections - the port exists, so
TCP won't simply reject (reset) the initial SYN, but when the handshake is
complete, there's no space left for this connection to be queued, so TCP
resets it at that point.
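
A minimal way to reproduce this kind of situation (a sketch, not taken from the
thread; the port and the tiny backlog are arbitrary) is a listener that never
calls accept(2), then repeated telnet attempts against it:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <err.h>
#include <unistd.h>

/* Sketch: bind, listen with a tiny backlog, and never accept(2), so the
 * listen queue fills after a handful of completed handshakes.  What the
 * kernel then does with further connection attempts (reset them, or just
 * drop the SYNs so the client retries) is exactly what is being discussed. */
int
main(void)
{
	int lsck = socket(AF_INET, SOCK_STREAM, 0);
	if (lsck == -1)
		err(1, "socket");

	struct sockaddr_in sa = {
		.sin_len = sizeof sa,
		.sin_family = AF_INET,
		.sin_port = htons(12345),
		.sin_addr = { .s_addr = htonl(INADDR_ANY) },
	};
	if (bind(lsck, (struct sockaddr *)&sa, sizeof sa) == -1)
		err(1, "bind");
	if (listen(lsck, 1) == -1)	/* deliberately tiny backlog */
		err(1, "listen");

	for (;;)
		pause();		/* never accept anything */
}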

kre


Edgar Fuß
2016-04-20 19:20:25 UTC
Post by Robert Elz
Sounds like connection (listen) queue full to me.
Ah, thanks!
Any way to list that queue?

Edgar Fuß
2016-04-22 09:42:48 UTC
Post by Robert Elz
Maybe premature.
Hm.
So, does anyone else have an idea whether a misguided user process could achieve this?

Robert Elz
2016-04-20 20:42:27 UTC
Date: Wed, 20 Apr 2016 21:20:25 +0200
From: Edgar Fuß <***@math.uni-bonn.de>
Message-ID: <***@trav.math.uni-bonn.de>

| Ah, thanks!

Maybe premature.

| Any way to list that queue?

Not that I know of - I think it is one of the completely overlooked
parts of the networking stack (perhaps because those queues are usually
empty). There's probably a way using netstat to list the PCB for the
listening server, and then crash(8) to hunt through that ... but ...

However, I am less sure that will be the cause of your problem now.
I did a (not very thorough) test (I enabled the telnet server, then
kill -STOP'd inetd, so it couldn't accept anything). It appears as
if when the queue is full, incoming connection packets are simply discarded.
This makes some sense, as it should cause the peer to retry in a couple of
seconds, which in normal circumstances (some idiot hasn't kill -STOP'd inetd)
would usually give a good chance of success. But the effect is, it seems,
that connections would time out, instead of being reset, in this situation.

kre


Edgar Fuß
2016-05-09 12:17:51 UTC
Post by Edgar Fuß
We have two production web servers, both running lighttpd, the first still
running on NetBSD 4.0.1, the second on 6.1. Occasionally, on the second, you
get Connection Reset by Peer (NOT Connection Refused) on the HTTP port.
Meanwhile, both servers are on 6.1 and now both exhibit the problem.

In the broken state, according to netstat, there are ~190 sockets in the
ESTABLISHED state with a local port number of 80, while the lighttpd process
(the only one listening on port 80) has ~no file descriptors open.

So, next question: Is it legally possible for a user process to be responsible
for a socket being in the ESTABLISHED state without having an fd open?

Timo Buhrmester
2016-05-20 01:52:09 UTC
Post by Edgar Fuß
according to netstat, there are ~190 sockets in the ESTABLISHED state
with a local port number of 80, while the lighttpd process
(the only one listening on port 80) has ~no file descriptors open.
I'm working with the same servers and have analyzed the issue.

It turned out that lighttpd uses accf_http(9); the server process never
even notices incoming connections to accept until ``a complete,
syntactically valid HTTP [...] request has been buffered by the kernel
[or] the data [...] cannot be part of a complete [such] request''. Okay.
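
For reference, a server arms accf_http(9) with a single setsockopt(2) on the
listening socket; a minimal sketch (the wrapper function name is made up, but
the call itself is the same one the test program later in this thread uses):

#include <string.h>
#include <sys/socket.h>
#include <err.h>

/* Sketch: attach the "httpready" accept filter (accf_http) to an
 * already-listening socket lsck. */
static void
arm_httpready(int lsck)
{
	struct accept_filter_arg accf;

	memset(&accf, 0, sizeof accf);
	strcpy(accf.af_name, "httpready");
	if (setsockopt(lsck, SOL_SOCKET, SO_ACCEPTFILTER,
	    &accf, sizeof accf) == -1)
		err(1, "setsockopt(SO_ACCEPTFILTER)");
}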

Now, unless I'm missing something, the accept filter does not seem to
have any sort of timeout mechanism - a TCP connection that has been
opened but is otherwise seeing no traffic whatsoever will be stuck
there until the remote end closes it. If the remote end never closes
it, it's stuck for good. Once what appears to be 193 such connections
are occupying the accept filter, establishing new connections becomes
impossible.

This is easily abused for denial of service purposes(*), and it is also a
real-world problem, because many web browsers tend to open multiple
connections at once, even if they're using only one of them (Safari has
been observed to open 6 concurrent connections, 5 of them unused, for a
simple PDF download).

Sometimes, those extra connections are not closed (think of a user
waiting for a download to finish, then putting their laptop to sleep
and leaving the (wifi) network).

Therefore, shouldn't there be (or am I just not seeing it?) some sort of
a timeout that disposes of connections that have been sitting in the
accept filter for longer than $timespan? Currently, we have to restart
lighttpd every so many days to cope with the slow but steady "leakage"
of connections...

Bests,
Timo Buhrmester


(*) Attached is an ugly POC exploit consisting of a shell script and
a helper program. The idea is to repeatedly open connections with
netcat, and then convince netcat to give up the connections silently,
without the other end ever noticing, by faking TCP RST packets.

Edgar Fuß
2016-06-01 09:45:01 UTC
Post by Timo Buhrmester
That said, what FreeBSD does is, in fact, call soabort() on
connections that are pushed out of their accept filter.
Is anybody in touch with FreeBSD people? I would like to ask them whether
they are behaving this way for a reason or whether they just haven't
considered accepting the old connection.

Tyler Retzlaff
2016-05-20 02:12:32 UTC
Post by Timo Buhrmester
Therefore, shouldn't there be (or am I just not seeing it?) sort of
a timeout that disposes of connections that have been sitting in the
accept filter for longer than $timespan? Currently, we have to restart
lighttpd every so many days to cope with the slow but steady "leakage"
of connections...
if getsockopt(2) SO_KEEPALIVE is set for the peer socket then yes.

tcp(4) provides more detail, and the timers can be configured via sysctl(8)
net.inet.tcp.keep*.

If I recall (system dependent default) the period before such a
connection will be timed out is very long (hours).

I'd normally expect the application to implement more meaningful timeout
mechanisms given what it is doing/waiting for. I'd definitely expect it
to be tuned for ~seconds rather than hours.
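
As a sketch, per-socket keepalive would look roughly like this; note that (as
pointed out in the follow-ups) it can only ever apply to a socket the
application actually holds, and the per-socket TCP_KEEP* options are guarded
here because their availability depends on the system:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <err.h>

/* Sketch only: turn on keepalive probing for a connected socket sck and,
 * where per-socket knobs exist, tighten the timers well below the
 * system-wide net.inet.tcp.keep* defaults. */
static void
enable_keepalive(int sck)
{
	int on = 1;

	if (setsockopt(sck, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1)
		err(1, "setsockopt(SO_KEEPALIVE)");
#ifdef TCP_KEEPIDLE
	int idle = 60, intvl = 10, cnt = 3;	/* seconds, seconds, probes */
	(void)setsockopt(sck, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof idle);
	(void)setsockopt(sck, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof intvl);
	(void)setsockopt(sck, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof cnt);
#endif
}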

rtr

Thor Lancelot Simon
2016-05-20 02:51:23 UTC
Post by Timo Buhrmester
Now, unless I'm missing something, the accept filter does not seem to
have any sort of timeout mechanism - a TCP connection that has been
opened but is otherwise seeing no traffic whatsoever will be stuck
there until the remote end closes it. If the remote end never closes
it, it's stuck for good. Once what appears to be 193 such connections
are occupying the accept filter, establishing new connections becomes
impossible.
The accept filter isn't really "occupied". It has no local state and
is just called from soisconnected() when an event happens on the socket
(e.g. data is received). Sockets just sit on so->so_q0 for the listen
socket until the accept filter lets them go.

I'm not seeing where your limit of 193 for the length of that queue
is coming from, or I would fix the issue right now. If you see where
it is, the fix should be to dequeue the oldest filtered connection on
the listen socket (adapted from accept_filt_clear, note the extra "break"!):

	if (so->so_accf != NULL) {
		/* Pass the oldest connection waiting in the accept filter */
		for (so2 = TAILQ_FIRST(&so->so_q0); so2 != NULL; so2 = next) {
			next = TAILQ_NEXT(so2, so_qe);
			if (so2->so_upcall == NULL) {
				continue;
			}
			so2->so_upcall = NULL;
			so2->so_upcallarg = NULL;
			so2->so_options &= ~SO_ACCEPTFILTER;
			so2->so_rcv.sb_flags &= ~SB_UPCALL;
			soisconnected(so2);
			break;
		}
	}

If anyone can see where to apply this, please let me know if it works.

Thor

Robert Elz
2016-05-20 03:03:37 UTC
Date: Thu, 19 May 2016 22:12:32 -0400
From: Tyler Retzlaff <***@netbsd.org>
Message-ID: <b6cb0b45-36a2-1036-ee64-***@netbsd.org>

| if getsockopt(2) SO_KEEPALIVE is set for the peer socket then yes.

That would (might) work with some of the situations described (the laptop
that has been turned off) but is useless against an actual DoS attack, and
also against the most likely bug in HTTP clients (particularly new ones).

That is, as long as the remote system is still there with the TCP connection
alive and sends ACKs when requested, keepalive doesn't help at all. Aside
from the deliberate attacker, the obvious way this could happen with
accf_http would be with a client that has "forgotten" that an empty line
is required to finish a GET/HEAD request, and sends what it thinks is
all it needs to send, and awaits a response. The filter hasn't yet received
the complete request, nor is what it has received invalid, so it will not
pass the connection to the application. We have a textbook deadlock, and
no way out (other than user intervention at one end or the other). Sure,
caused by a bug, but aren't they all?

I agree with Timo, these filters should all have (configurable) timeouts
added (and they should default to something rational - for accf_http, about
5 minutes should be plenty - so that apps that have not yet learned how to
set the timeout still benefit). Whether when the timeout goes off the
connection should simply be reset, or be passed to the application should
probably also be configurable [as part of setting the timeout] (with the
default being reset I would expect).

| I'd normally expect the application to implement more meaningful timeout
| mechanisms given what it is doing/waiting for. I'd definitely expect it
| to be tuned for ~seconds rather than hours.

I'd agree, and the mechanism to allow it to do that is what is being
discussed. Currently, at least as best I can see, there is none.

How to add a timeout mechanism (the API would be a new setsockopt() I'd
assume, that's not the issue) I have no idea at the minute, though.

In a message I haven't yet quite received enough to quote properly
Thor (tls@) says ..

| I'm not seeing where your limit of 193 for the length of that queue
| is coming from,

Perhaps that's the listen() queue limit in the application he's using?

kre


Timo Buhrmester
2016-05-20 03:15:48 UTC
Post by Robert Elz
| I'm not seeing where your limit of 193 for the length of that queue
| is coming from,
Perhaps that's the listen() queue limit in the application he's using?
No, that's 1024 in lighttpd. I'll try to find out where the 193 comes from.

Timo Buhrmester
2016-05-20 02:48:35 UTC
Post by Tyler Retzlaff
Post by Timo Buhrmester
Therefore, shouldn't there be (or am I just not seeing it?) some sort of
a timeout that disposes of connections that have been sitting in the
accept filter for longer than $timespan? Currently, we have to restart
lighttpd every so many days to cope with the slow but steady "leakage"
of connections...
if getsockopt(2) SO_KEEPALIVE is set for the peer socket then yes.
But there is no peer socket yet. The connection is queued in the
kernel's accept filter, waiting to see a complete HTTP request before
the kernel even lets accept(2) return (or in our case, before kqueue
produces the "here's something to accept"-information.)

Or do you mean to SO_KEEPALIVE the listening socket?
Post by Tyler Retzlaff
I'd normally expect the application to implement more meaningful timeout
mechanisms given what it is doing/waiting for. I'd definitely expect it to
be tuned for ~seconds rather than hours.
The application has no way to be aware of the connections sitting in the
accept filter, that's kind of the point of it. For sockets it is actually
aware of, lighttpd certainly does implement timeouts.

Timo Buhrmester
2016-05-20 01:55:47 UTC
Post by Timo Buhrmester
(*) Attached is [...]
Attached it is now, hopefully.

Michael van Elst
2016-05-20 06:33:58 UTC
Post by Thor Lancelot Simon
I'm not seeing where your limit of 193 for the length of that queue
is coming from, or I would fix the issue right now.
That's the socket's listen queue. It can handle a default of SOMAXCONN=128
connections, but the check in sonewconn is:

	if (head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2) {
		/* Listen queue overflow. */
		return NULL;
	}

128 * 3 / 2 = 192. Since the check runs before the new connection is queued,
the 193rd connection is still admitted and only the 194th attempt fails,
which matches the observed limit of 193.
--
Michael van Elst
Internet: ***@serpens.de
"A potential Snark may lurk in every tree."

Timo Buhrmester
2016-05-23 02:39:07 UTC
Post by Thor Lancelot Simon
The attached patch (untested, not even compiled!) may work. I believe it
is strictly fair: the longer a cxn sits idle, sending no valid request,
the more likely it is to be passed through to the application, which may
in turn drop it.
Thanks for this, it essentially works. However, while it does remove one
pending socket from q0 by calling soisconnected, it still runs into the
return NULL at the bottom, so that particular attempt to establish a TCP
connection still fails. The next one works then, though.

Since soisconnected makes room on q0, I figure it might as well proceed
allocating the new socket if soisconnected was in fact called on one of the
sockets in the accept filter; below is your patch slightly modified so it
does this. I have tested it and was able to connect(2) until I ran out
of file descriptors.

Cheers,
Timo Buhrmester


diff --git a/sys/kern/uipc_socket2.c b/sys/kern/uipc_socket2.c
index 58335ec..99a6467 100644
--- a/sys/kern/uipc_socket2.c
+++ b/sys/kern/uipc_socket2.c
@@ -262,8 +262,37 @@ sonewconn(struct socket *head, bool soready)
 	KASSERT(solocked(head));
 
 	if (head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2) {
-		/* Listen queue overflow. */
-		return NULL;
+		/*
+		 * Listen queue overflow.  If there is an accept filter
+		 * active, pass through the oldest cxn it's handling.
+		 */
+		if (head->so_accf == NULL) {
+			return NULL;
+		} else {
+			struct socket *so2, *next;
+
+			/* Pass the oldest connection waiting in the
+			   accept filter */
+			for (so2 = TAILQ_FIRST(&head->so_q0);
+			     so2 != NULL; so2 = next) {
+				next = TAILQ_NEXT(so2, so_qe);
+				if (so2->so_upcall == NULL) {
+					continue;
+				}
+				so2->so_upcall = NULL;
+				so2->so_upcallarg = NULL;
+				so2->so_options &= ~SO_ACCEPTFILTER;
+				so2->so_rcv.sb_flags &= ~SB_UPCALL;
+				soisconnected(so2);
+				break;
+			}
+
+			/* If nothing was nudged out of the accept filter, bail
+			 * out; otherwise proceed allocating the socket. */
+			if (so2 == NULL) {
+				return NULL;
+			}
+		}
 	}
 	if ((head->so_options & SO_ACCEPTFILTER) != 0) {
 		soready = false;

Edgar Fuß
2016-05-25 10:44:21 UTC
Post by Thor Lancelot Simon
The attached patch (untested, not even compiled!) may work. I believe it
is strictly fair: the longer a cxn sits idle, sending no valid request,
the more likely it is to be passed through to the application, which may
in turn drop it.
This solution looks extremely elegant since it doesn't need any new setsockopt option.
However, it relies on the application being prepared to have nothing to read() after an accept().
I guess that since most applications are written to work without Accept Filters, that assumption is probably OK.

Edgar Fuß
2017-06-14 09:00:29 UTC
TLS> The attached patch (untested, not even compiled!) may work.
TB> Thanks for this, it essentially works.
TB> below is your patch slightly modified
Timo just told me that TLS committed this modified patch about a year ago
(sys/kern/uipc_socket2.c 1.123) without either of us noticing. Thanks.

Michael van Elst
2016-05-20 06:37:02 UTC
Post by Timo Buhrmester
Post by Robert Elz
| I'm not seeing where your limit of 193 for the length of that queue
| is coming from,
Perhaps that's the listen() queue limit in the application he's using?
No, that's 1024 in lighttpd. I'll try to find out where the 193 comes from.
listen queue is max(backlog, somaxconn). So your 1024 is reduced to 128.

somaxconn is a tunable, but there is no sysctl so far.
--
Michael van Elst
Internet: ***@serpens.de
"A potential Snark may lurk in every tree."

Robert Elz
2016-05-20 08:02:16 UTC
Date: Fri, 20 May 2016 06:37:02 +0000 (UTC)
From: ***@serpens.de (Michael van Elst)
Message-ID: <nhmbae$bb4$***@serpens.de>

| listen queue is max(backlog, somaxconn). So your 1024 is reduced to 128.

min(backlog, somaxconn) (from kern/uipc_socket.c) -- max() would never
reduce anything, but min() does.
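
Roughly, the relevant part of solisten() amounts to the following (a paraphrase
for illustration, not the literal kernel source):

/* Illustration only: why a backlog of 1024 still ends up as a qlimit of
 * 128 when somaxconn is 128. */
static int
clamped_qlimit(int backlog, int somaxconn)
{
	if (backlog < 0)
		backlog = 0;
	return backlog < somaxconn ? backlog : somaxconn;	/* min() */
}
/* clamped_qlimit(1024, 128) == 128, and 3 * 128 / 2 = 192. */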

kre


Michael van Elst
2016-05-20 10:26:29 UTC
Post by Robert Elz
Date: Fri, 20 May 2016 06:37:02 +0000 (UTC)
| listen queue is max(backlog, somaxconn). So your 1024 is reduced to 128.
min(backlog, somaxconn) (from kern/uipc_socket.c) -- max() would never
reduce anything, but min() does.
right. copy&paste would have been better :)
--
Michael van Elst
Internet: ***@serpens.de
"A potential Snark may lurk in every tree."

Robert Elz
2016-05-20 08:33:36 UTC
Date: Thu, 19 May 2016 22:51:23 -0400
From: Thor Lancelot Simon <***@panix.com>
Message-ID: <***@panix.com>

Now that Thor's mail (quite a while ago, I deferred sending this)
has really arrived (don't ask!) ...

| The accept filter isn't really "occupied". It has no local state and
| is just called from soisconnected() when an event happens on the socket
| (e.g. data is received). Sockets just sit on so->so_q0 for the listen
| socket until the accept filter lets them go.

Since, thanks to Michael van Elst, we now know this is almost certainly the
issue, perhaps that is where a timeout needs to be in general. Nothing should
live on that queue for more than a few minutes, ever - whether the cause
is just a buggy server that isn't bothering to accept() when it could,
or a filter preventing the accept from receiving the connection, nothing should
ever be left in limbo on the listen queue for very long.

| If you see where it is, the fix should be to dequeue the oldest
| filtered connection on the listen socket

I don't think I'd do it that way. 99% of applications that are filtering
incoming connections are not going to simply want to delay them to a point
where they then have to delay again.

That is, if a filter is waiting for data (accf_data) and no data has
arrived, then all the application can do is wonder why the filter failed,
and wait for data itself. Obviously it can set a timeout and abort the
connection after a while - but it has no way of knowing that the connection
has already been queued in the kernel for hours (or how many hours, it might
be just fractions of a second if the incoming connection request rate is very
high and the queue fills quickly.)

Similarly for an incomplete but not invalid HTTP request via accf_http.

In those cases (which is currently, I think, all cases) the better solution
is just to reset the connection inside the kernel, so that the application
never discovers it was ever attempted. As I understand it, by this
stage it is too late to simply ignore it as we would do if the listen
queue is full when the SYN request arrives (encouraging the client to
try again in a few seconds, by which time we hope that the queue will
have drained.)

In the case of a server that is broken, and not accepting connections, that's
also clearly what is needed.

But if a new (presumed new) setsockopt() was added that allows a server to
control the timeout for filtered connections, it should also allow it to
decide between reject and accept old pending connections - in case it wants
to log them, or take more drastic counter measures in the event it appears
as if it might be an actual attack attempt (like installing a bpf filter
to block packets from the source of repeated attempts).

I assume it ought to be possible to use the packet arrival time in the mbuf
header to work out how long a connection has been pending in the queue?

kre




Robert Elz
2016-05-25 16:04:51 UTC
Date: Wed, 25 May 2016 15:31:21 +0200
From: Edgar Fuß <***@math.uni-bonn.de>
Message-ID: <***@gumme.math.uni-bonn.de>

| > In the case of a server that is broken, and not accepting connections, that's
| > also clearly what is needed.
| But without an Accept Filter, those connection requests would also
| just fill the queue?

Yes, that was the point, the best thing to do when the server is
(for whatever reason) refusing to accept connections is to reject
attempts to make them, just as if the server was not there at all
(which it, effectively, isn't.)

| I think tls@'s approach would make all this possible at no cost,
| wouldn't it?

It certainly allows the server to deal with the connection as it sees
fit, and in the "ordinary" case (the one that provoked this examination)
is likely to work just OK - that is, where there are just occasional
lost connections that gradually build up in the queue because of the
filter, though even there I'd like to observe what happens when this state
is reached, with the 192 connections clogging the queue, and new connections
still arriving - it would probably be better to clear the queue of all the
backlogged crud rather than removing one at a time - while also ignoring
the incoming request that provoked it.

I suspect it might easily happen that by the time the client that is
making the new connection tries again, some other connection will have
occupied the (one) free queue slot, and is working its way to satisfying
the filter - thus causing another old connection to be bumped to the
server, and ignoring the incoming request, again .. perhaps over and over
again.

I also suspect that it kind of defeats the principal objective of the
filters - that is, to lower overhead for really busy servers. It really
doesn't make a lot of difference for servers that only process a few
connections a second - whether the server, or the kernel, processes the
initial data should not make that much difference. But with a very busy
server, receiving perhaps thousands of new requests a second, the listen
queue is going to fill very quickly, without allowing time for the filter
to actually do its job. Clearly the server can then go ahead and process
things itself, but this is defeating the purpose of the filter which is
precisely to allow the server to ignore the connection (consume no resources)
until it has real work to do, which it then ought to be able to complete
very quickly and go away - passing through the connections when the only
reason it is being done is because the load is very high, also seems wrong
to me (that is, I would expect that if filters are in use, sufficient time
should be allowed (a couple of seconds at least) before the connection
(if the filter hasn't finished of course) is ever a candidate for handing
off to the server). The current fix doesn't allow for that. When this is
the cause of the queue overflow, the best thing to do is actually to simply
ignore incoming connection requests until the queue level drops (the same
as happens when there is no filter and the queue overflows because of
connections arriving faster than the server can cope with over some burst.)

It is certainly a cheap change, and an easy one, but I doubt it is really the
best way. Obviously better than nothing. But I believe that to
handle this properly, the code needs to have some concept of time, and
use it.

kre


Edgar Fuß
2016-05-25 16:25:59 UTC
Post by Robert Elz
while also ignoring the incoming request that provoked it.
Timo's patch to the patch specifically tries to avoid that, no?

Timo Buhrmester
2016-05-25 18:06:14 UTC
Post by Edgar Fuß
while also ignoring the incoming request that provoked it.
Timo's patch to the patch specifically tries to avoid that, no?
It does, but Robert is right in that eventually, when the queue is
full, it remains full (drop one from q0, add one to q0, ad inf.), so
the accept filter is for all practical purposes defeated.

I wonder if instead of dropping *one* socket in the case of an
overflow, q0 should be iterated and purged of *all* sockets that have
been on it for longer than $time. That would only require the sockets
to know the time they were created, without needing periodic
housekeeping with a timer or something.
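
As a rough sketch (modelled on the loop from tls@'s patch, and assuming a
hypothetical so_q0stamp field recording when the socket was put on q0 - no
such field exists today), such a purge might look like:

/* Hypothetical, untested sketch - so_q0stamp is invented for illustration. */
struct socket *so2, *next;
time_t now = time_second;

for (so2 = TAILQ_FIRST(&head->so_q0); so2 != NULL; so2 = next) {
	next = TAILQ_NEXT(so2, so_qe);
	if (so2->so_upcall == NULL)
		continue;
	if (now - so2->so_q0stamp < maxlinger)	/* e.g. maxlinger = 180s */
		continue;
	so2->so_upcall = NULL;
	so2->so_upcallarg = NULL;
	so2->so_options &= ~SO_ACCEPTFILTER;
	so2->so_rcv.sb_flags &= ~SB_UPCALL;
	soisconnected(so2);	/* or soabort(so2) to drop instead of pass */
}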

Edgar Fuß
2016-05-25 18:33:00 UTC
Post by Timo Buhrmester
eventually, when the queue is full, it remains full (drop one from q0,
add one to q0, ad inf.), so the accept filter is for all practical
purposes defeated.
Yes, right (unless the TCP connection is dropped eventually).
Post by Timo Buhrmester
I wonder if instead of dropping *one* socket in the case of an
overflow, q0 should be iterated and purged of *all* sockets that have
been on it for longer than $time.
For what value of "purged"? Connect or drop?
Perhaps connect the top third or half of the queue? Or set a flag that will
continue to connect two old sockets for every new one until the queue is one
third/half empty again?

Timo Buhrmester
2016-05-25 20:10:04 UTC
Post by Edgar Fuß
Post by Timo Buhrmester
I wonder if instead of dropping *one* socket in the case of an
overflow, q0 should be iterated and purged of *all* sockets that have
been on it for longer than $time.
For what value of "purged"? Connect or drop?
I tend towards "drop"; that's mainly because I can't imagine(*) a situation
in which a client would legitimately connect and then uselessly linger
around for a substantial amount of time before sending their request.
Can anyone else, for any protocol in which the client is supposed to
be the first to send data?

(*) Actually, I can imagine two:
1. A human manually issuing a request using (line-buffered) telnet or
netcat. This could be countered by making the decision to drop or to
accept depend on whether /any/ data has been seen on the socket so far
(like half a HTTP request).
2. The extra connections web browsers seem to open. If however they
are not used to make actual requests for a while, one could arguably
consider them (practically) defective and it wouldn't hurt to RST them.
The browser is going to open new useless connections anyway.


I believe that a "maximum linger time" (for lack of a better term and
unrelated to SO_LINGER) as short as, say, 3 minutes would have a high
success rate at dropping de-facto dead connections while providing
adequate room for legitimate, but *really* slow, connections.

Now if the queue is full of only sockets that have been there for
*less* than those 3 minutes (like in a very busy server), I would
force one (or multiple) to be accepted rather than dropped, otherwise
it's pretty simple to DoS again.
Post by Edgar Fuß
Or set a flag that will continue to connect two old sockets for
every new one until the queue is one third/half empty again?
That sounds like a good idea too, I'd expect it to have good
"dynamics" regardless of whether one's dealing with a mostly idle
or mostly busy server.

Edgar Fuß
2016-05-28 20:27:40 UTC
Post by Timo Buhrmester
It does, but Robert is right in that eventually, when the queue is
full, it remains full (drop one from q0, add one to q0, ad inf.), so
the accept filter is for all practical purposes defeated.
I disagree.
If q0 is full, then every new connection will be inserted at the "new" end
and push out another one at the "old" end. If there's more pressure on q0,
more of the old entries will be pushed out. Provided you have substantially
more real connections (i.e. those that will send a full request in a short
time) than dead ones (i.e. those that will linger in the Accept Filter
indefinitely), in the long run, you will get exactly what you need: just
enough free entries on q0 to have new connections stay in the Accept Filter
as short as they need to.
Of course, the rest of q0 will be full of garbage, but that doesn't do any
harm, does it?
If you have more dead connections than real ones, you're in trouble no matter
what you do.

Timo Buhrmester
2016-05-31 21:45:56 UTC
Post by Edgar Fuß
Post by Timo Buhrmester
It does, but Robert is right in that eventually, when the queue is
full, it remains full (drop one from q0, add one to q0, ad inf.), so
the accept filter is for all practical purposes defeated.
I disagree.
If q0 is full, then every new connection will be inserted at the "new" end
and push out another one at the "old" end. If there's more pressure on q0,
more of the old entries will be pushed out. Provided you have substantially
more real connections (i.e. those that will send a full request in a short
time) than dead ones (i.e. those that will linger in the Accept Filter
indefinetely), in the long run, you will get exactly what you need: just
enough free entries on q0 to have new connections stay in the Accept Filter
as short as they need to.
Edgar and I have discussed this in person and I agree now: other than
cluttering up what netstat reports, the "dead" connections should cause
no harm with tls@'s approach (and my modification to it).

We also believe it is not too late to document that accept filters
might, under special (i.e. these) circumstances, pass through sockets
that would under normal conditions be retained. Accept filters are
young enough a concept so it seems unlikely that there's any major
software that inherently depends on accept filters (as opposed to using
them opportunistically and hence being prepared to handle "dead"
connections anyway.) Or is there?

That said, what FreeBSD does is, in fact, call soabort() on
connections that are pushed out of their accept filter. While this
does preserve the filter's semantics for accept(2) to only ever
return sockets that can be read(2), it keeps the service from ever
noticing anything about the situation, even though noticing might be
desirable for, say, telling blacklistd about it.

Edgar Fuß
2016-06-01 10:01:21 UTC
Post by Robert Elz
the best thing to do when the server is (for whatever reason) refusing to
accept connections is to reject attempts to make them
Yes. The question is whether the kernel should drop them or pass them to the
application which then drops them (after logging or taking counter-measures).
Post by Robert Elz
the 192 connections clogging the queue, and new connections still arriving
- it would probably be better to clear the queue of all the backlogged crud
rather than removing one at a time
I also felt that it would be preferable to clean the queue, but I couldn't
make any rational argument to support that feeling.
Post by Robert Elz
while also ignoring the incoming request that provoked it.
Why that?
Post by Robert Elz
I suspect it might easily happen that by the time the client that is
making the new connection tries again, some other connection will have
occupied the (one) free queue slot, and is working its way to satisfying
the filter - thus causing another old connection to be bumped to the
server, and ignoring the incoming request, again .. perhaps over and over
again.
But that would mean that going-to-be-dead connections arrive at a similar
rate as ordinary ones. I would say that, if that is the case, the application
is in serious trouble no matter what you do.
With the Timo variant of the tls approach, the new connections would have a
chance to get passed to the application. They simply push the older ones
out of q0.
Post by Robert Elz
But I believe that to handle this properly, the code needs to have some
concept of time, and use it.
Isn't relative time (i.e. queue position) enough?

Can you make up a scenario where the tls/Timo solution doesn't work well
but there still is a way to deal with it better that doesn't require looking
into the future?

Robert Elz
2016-06-01 11:50:51 UTC
Date: Wed, 1 Jun 2016 12:01:21 +0200
From: Edgar Fuß <***@math.uni-bonn.de>
Message-ID: <***@gumme.math.uni-bonn.de>

| But that would mean that going-to-be-dead connections arrive at a similar
| rate as ordinary ones.

They are not necessarily "going-to-be-dead" - all we know is that the
queue is full. Wait a few seconds, and all of the pending connections
may have satisfied the filter and be delivered.

| I would say that, if that is the case, the application
| is in serious trouble no matter what you do.

The system is overloaded. That's the primary function of the filters,
to help in that situation. When there's low load they're not needed
(the application can handle things just fine.)

| Isn't relative time (i.e. queue position) enough?

No. Because how that relates to the real world depends upon the
rate at which connections are arriving, and the RTT to the source,
neither of which is meaningful without considering real time.

| Can you make up a scenario where the tls/Timo solution doesn't work well
| but there still is a way to deal with it better that doesn't require
| looking into the future?

Sure, if connections are arriving too fast for the queue to process, the
best solution is to simply drop (ignore) incoming connections as happened
previously. The problem only occurs because the queue became clogged
with stuff that was never going away - it is only those connections
(connection attempts) that need attention, anything that is just in the
queue because it hasn't had time yet to have satisfied the filter should
be left alone - those connections have been SYN/ACK'd already, later
incoming ones attempting to get on the queue haven't, if those are dropped
the source sees (effectively) just a lost packet and will retry soon enough.

When the problem is just a temporary sudden burst of requests, rather than
a sustained overload, that allows everyone to be handled, without adding
excess load to the system.

But it does require knowing (which I suspect that we do already) how long
the request has been in the queue - anything that's been there < about 5
seconds should be left alone, always, anything that's been there > about
5 minutes should simply be discarded (aborted).

An additional problem with giving the old crud to the application is that
it gets no notification at all as to how long the request has been there.
All it can do when it receives the result from accept() is to start a new
timer and wait (even longer) for the full request to arrive - all totally
pointless. The only possible use of sending a request that has failed to
satisfy a filter to the application is for logging (and similar measures).
Using accept() for that isn't really the best way, there ought to be a
way to notify something (not necessarily the application in question, it
usually doesn't care, it is the system admin who wants to know) that
there are problems - probably via some new (not currently invented) mechanism.

That can log the events if desired, and if it sees enough from one source
that it starts to look like an attack, then take counter measures.

Please do remember here that this is one single mechanism that has to cope
with lots of different situations, from normal connection requests,
overload bursts, sustained overload, DoS attacks, broken clients that
don't send what they should, and broken servers that don't accept connections.
The kernel part of the solution has to work reasonably for all of those,
while at the same time also acting reasonably as seen by the clients
(other than attackers.)

kre


Edgar Fuß
2016-06-02 09:34:52 UTC
Ah thanks, I now understand.

So on the extreme ends, we have two scenarios leading to q0 overflow:
A) slow aggregation of dead connections
B) temporary overload
(If you omit "slow" from A or "temporary" from B, you're in trouble.)

With A, it's fine to move an old entry from q0 to q and attach the new one
to q0.
With B, it's best to drop the new connection attempt; probably drop parts
of q0, too.

Couldn't the size of q (at least after exercising the "A" solution for a
short while) be used to distinguish between the two? I.e. accept old/queue new
if q is empty, drop new (and drop parts of q0) if q is full?
A connection burst would then initially be mistaken for a slow aggregation
(q0 full, q empty), but only until the "wrong" decision to accept would fill
q too, which would then automatically switch to the appropriate behaviour.
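
As a purely illustrative sketch of where such a decision could hook in (in the
style of the patch above, not a tested change):

/* Illustration only, not working kernel code. */
if (head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2) {
	if (head->so_qlen == 0) {
		/* Scenario A: q empty, q0 clogged with dead connections.
		 * Pass the oldest filtered connection to the application
		 * (as the committed patch does) and queue the new one. */
	} else {
		/* Scenario B: the application is not keeping up.  Drop
		 * the new attempt (and possibly prune old q0 entries). */
		return NULL;
	}
}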

Tyler Retzlaff
2016-05-20 12:16:40 UTC
Post by Timo Buhrmester
Post by Tyler Retzlaff
Post by Timo Buhrmester
Therefore, shouldn't there be (or am I just not seeing it?) some sort of
a timeout that disposes of connections that have been sitting in the
accept filter for longer than $timespan? Currently, we have to restart
lighttpd every so many days to cope with the slow but steady "leakage"
of connections...
if getsockopt(2) SO_KEEPALIVE is set for the peer socket then yes.
But there is no peer socket yet. The connection is queued in the
kernel's accept filter, waiting to see a complete HTTP request before
the kernel even lets accept(2) return (or in our case, before kqueue
produces the "here's something to accept"-information.)
Or do you mean to SO_KEEPALIVE the listening socket?
No, this is my mistake; I completely ignored that you were talking about
accf_http(9). Sorry for the noise.

rtr

Timo Buhrmester
2016-05-20 14:11:09 UTC
Post by Timo Buhrmester
Post by Robert Elz
Perhaps that's the listen() queue limit in the application he's using?
No, that's 1024 in lighttpd. I'll try to find out where the 193 comes from.
It turns out the number of possible connections does somewhat depend on
the listen backlog after all. I wrote the attached program to find out
what the correlation is, the result is:

| accftest: established 2 connections before it blew up, backlog was 1
| accftest: established 4 connections before it blew up, backlog was 2
| accftest: established 5 connections before it blew up, backlog was 3
| accftest: established 7 connections before it blew up, backlog was 4
| accftest: established 8 connections before it blew up, backlog was 5
| accftest: established 10 connections before it blew up, backlog was 6
| accftest: established 11 connections before it blew up, backlog was 7
| accftest: established 13 connections before it blew up, backlog was 8
| accftest: established 14 connections before it blew up, backlog was 9
| accftest: established 16 connections before it blew up, backlog was 10
| accftest: established 17 connections before it blew up, backlog was 11
| accftest: established 19 connections before it blew up, backlog was 12
| accftest: established 20 connections before it blew up, backlog was 13
| accftest: established 22 connections before it blew up, backlog was 14
| accftest: established 23 connections before it blew up, backlog was 15
| accftest: established 25 connections before it blew up, backlog was 16
| accftest: established 26 connections before it blew up, backlog was 17
| accftest: established 28 connections before it blew up, backlog was 18
| accftest: established 29 connections before it blew up, backlog was 19
| accftest: established 31 connections before it blew up, backlog was 20
| accftest: established 32 connections before it blew up, backlog was 21
| accftest: established 34 connections before it blew up, backlog was 22
| accftest: established 35 connections before it blew up, backlog was 23
| accftest: established 37 connections before it blew up, backlog was 24
| accftest: established 38 connections before it blew up, backlog was 25
| accftest: established 40 connections before it blew up, backlog was 26
| accftest: established 41 connections before it blew up, backlog was 27
| accftest: established 43 connections before it blew up, backlog was 28
| accftest: established 44 connections before it blew up, backlog was 29
| accftest: established 46 connections before it blew up, backlog was 30
| accftest: established 47 connections before it blew up, backlog was 31
| accftest: established 49 connections before it blew up, backlog was 32
| accftest: established 50 connections before it blew up, backlog was 33
| accftest: established 52 connections before it blew up, backlog was 34
| accftest: established 53 connections before it blew up, backlog was 35
| accftest: established 55 connections before it blew up, backlog was 36
| accftest: established 56 connections before it blew up, backlog was 37
| accftest: established 58 connections before it blew up, backlog was 38
| accftest: established 59 connections before it blew up, backlog was 39
| accftest: established 61 connections before it blew up, backlog was 40
| accftest: established 62 connections before it blew up, backlog was 41
| accftest: established 64 connections before it blew up, backlog was 42
| accftest: established 65 connections before it blew up, backlog was 43
| accftest: established 67 connections before it blew up, backlog was 44
| accftest: established 68 connections before it blew up, backlog was 45
| accftest: established 70 connections before it blew up, backlog was 46
| accftest: established 71 connections before it blew up, backlog was 47
| accftest: established 73 connections before it blew up, backlog was 48
| accftest: established 74 connections before it blew up, backlog was 49
| accftest: established 76 connections before it blew up, backlog was 50
| accftest: established 77 connections before it blew up, backlog was 51
| accftest: established 79 connections before it blew up, backlog was 52
| accftest: established 80 connections before it blew up, backlog was 53
| accftest: established 82 connections before it blew up, backlog was 54
| accftest: established 83 connections before it blew up, backlog was 55
| accftest: established 85 connections before it blew up, backlog was 56
| accftest: established 86 connections before it blew up, backlog was 57
| accftest: established 88 connections before it blew up, backlog was 58
| accftest: established 89 connections before it blew up, backlog was 59
| accftest: established 91 connections before it blew up, backlog was 60
| accftest: established 92 connections before it blew up, backlog was 61
| accftest: established 94 connections before it blew up, backlog was 62
| accftest: established 95 connections before it blew up, backlog was 63
| accftest: established 97 connections before it blew up, backlog was 64
| accftest: established 98 connections before it blew up, backlog was 65
| accftest: established 100 connections before it blew up, backlog was 66
| accftest: established 101 connections before it blew up, backlog was 67
| accftest: established 103 connections before it blew up, backlog was 68
| accftest: established 104 connections before it blew up, backlog was 69
| accftest: established 106 connections before it blew up, backlog was 70
| accftest: established 107 connections before it blew up, backlog was 71
| accftest: established 109 connections before it blew up, backlog was 72
| accftest: established 110 connections before it blew up, backlog was 73
| accftest: established 112 connections before it blew up, backlog was 74
| accftest: established 113 connections before it blew up, backlog was 75
| accftest: established 115 connections before it blew up, backlog was 76
| accftest: established 116 connections before it blew up, backlog was 77
| accftest: established 118 connections before it blew up, backlog was 78
| accftest: established 119 connections before it blew up, backlog was 79
| accftest: established 121 connections before it blew up, backlog was 80
| accftest: established 122 connections before it blew up, backlog was 81
| accftest: established 124 connections before it blew up, backlog was 82
| accftest: established 125 connections before it blew up, backlog was 83
| accftest: established 127 connections before it blew up, backlog was 84
| accftest: established 128 connections before it blew up, backlog was 85
| accftest: established 130 connections before it blew up, backlog was 86
| accftest: established 131 connections before it blew up, backlog was 87
| accftest: established 133 connections before it blew up, backlog was 88
| accftest: established 134 connections before it blew up, backlog was 89
| accftest: established 136 connections before it blew up, backlog was 90
| accftest: established 137 connections before it blew up, backlog was 91
| accftest: established 139 connections before it blew up, backlog was 92
| accftest: established 140 connections before it blew up, backlog was 93
| accftest: established 142 connections before it blew up, backlog was 94
| accftest: established 143 connections before it blew up, backlog was 95
| accftest: established 145 connections before it blew up, backlog was 96
| accftest: established 146 connections before it blew up, backlog was 97
| accftest: established 148 connections before it blew up, backlog was 98
| accftest: established 149 connections before it blew up, backlog was 99
| accftest: established 151 connections before it blew up, backlog was 100
| accftest: established 152 connections before it blew up, backlog was 101
| accftest: established 154 connections before it blew up, backlog was 102
| accftest: established 155 connections before it blew up, backlog was 103
| accftest: established 157 connections before it blew up, backlog was 104
| accftest: established 158 connections before it blew up, backlog was 105
| accftest: established 160 connections before it blew up, backlog was 106
| accftest: established 161 connections before it blew up, backlog was 107
| accftest: established 163 connections before it blew up, backlog was 108
| accftest: established 164 connections before it blew up, backlog was 109
| accftest: established 166 connections before it blew up, backlog was 110
| accftest: established 167 connections before it blew up, backlog was 111
| accftest: established 169 connections before it blew up, backlog was 112
| accftest: established 170 connections before it blew up, backlog was 113
| accftest: established 172 connections before it blew up, backlog was 114
| accftest: established 173 connections before it blew up, backlog was 115
| accftest: established 175 connections before it blew up, backlog was 116
| accftest: established 176 connections before it blew up, backlog was 117
| accftest: established 178 connections before it blew up, backlog was 118
| accftest: established 179 connections before it blew up, backlog was 119
| accftest: established 181 connections before it blew up, backlog was 120
| accftest: established 182 connections before it blew up, backlog was 121
| accftest: established 184 connections before it blew up, backlog was 122
| accftest: established 185 connections before it blew up, backlog was 123
| accftest: established 187 connections before it blew up, backlog was 124
| accftest: established 188 connections before it blew up, backlog was 125
| accftest: established 190 connections before it blew up, backlog was 126
| accftest: established 191 connections before it blew up, backlog was 127
| accftest: established 193 connections before it blew up, backlog was 128
| accftest: established 193 connections before it blew up, backlog was 129
| accftest: established 193 connections before it blew up, backlog was 130
| accftest: established 193 connections before it blew up, backlog was 131
| accftest: established 193 connections before it blew up, backlog was 132
| accftest: established 193 connections before it blew up, backlog was 133
| accftest: established 193 connections before it blew up, backlog was 134
| accftest: established 193 connections before it blew up, backlog was 135
| accftest: established 193 connections before it blew up, backlog was 136
| accftest: established 193 connections before it blew up, backlog was 137
| accftest: established 193 connections before it blew up, backlog was 138
| accftest: established 193 connections before it blew up, backlog was 139
| [tops off at 193]

Thor Lancelot Simon
2016-05-20 14:12:04 UTC
Post by Michael van Elst
Post by Thor Lancelot Simon
I'm not seeing where your limit of 193 for the length of that queue
is coming from, or I would fix the issue right now.
That's the socket's listen queue. It can handle a default of SOMAXCONN=128
Gah, I missed the math there. Of course.

The attached patch (untested, not even compiled!) may work. I believe it
is strictly fair: the longer a cxn sits idle, sending no valid request,
the more likely it is to be passed through to the application, which may
in turn drop it.

Thor

Index: uipc_socket2.c
===================================================================
RCS file: /cvsroot/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.119
diff -u -p -r1.119 uipc_socket2.c
--- uipc_socket2.c 19 May 2014 02:51:24 -0000 1.119
+++ uipc_socket2.c 20 May 2016 14:09:09 -0000
@@ -260,7 +260,29 @@ sonewconn(struct socket *head, bool sore
 	KASSERT(solocked(head));
 
 	if (head->so_qlen + head->so_q0len > 3 * head->so_qlimit / 2) {
-		/* Listen queue overflow. */
+		/*
+		 * Listen queue overflow.  If there is an accept filter
+		 * active, pass through the oldest cxn it's handling.
+		 */
+		if (head->so_accf != NULL) {
+			struct socket *so2, *next;
+
+			/* Pass the oldest connection waiting in the
+			   accept filter */
+			for (so2 = TAILQ_FIRST(&head->so_q0);
+			     so2 != NULL; so2 = next) {
+				next = TAILQ_NEXT(so2, so_qe);
+				if (so2->so_upcall == NULL) {
+					continue;
+				}
+				so2->so_upcall = NULL;
+				so2->so_upcallarg = NULL;
+				so2->so_options &= ~SO_ACCEPTFILTER;
+				so2->so_rcv.sb_flags &= ~SB_UPCALL;
+				soisconnected(so2);
+				break;
+			}
+		}
 		return NULL;
 	}
 	if ((head->so_options & SO_ACCEPTFILTER) != 0) {

Timo Buhrmester
2016-05-20 14:13:33 UTC
Sorry, wrong program attached. Here's the right one.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <arpa/inet.h>
#include <unistd.h>
#include <sys/socket.h>
#include <err.h>

#define LISTENADDR "192.168.1.12"
#define LISTENPORT 12345

void
do_listen(int backlog)
{
	int lsck = socket(AF_INET, SOCK_STREAM, 0);
	if (lsck == -1)
		err(1, "socket");

	struct sockaddr_in sa = {
		.sin_len = sizeof sa,
		.sin_family = AF_INET,
		.sin_port = htons(LISTENPORT)
	};

	int r = inet_pton(AF_INET, LISTENADDR, &sa.sin_addr);
	if (r != 1)
		errx(1, "inet_pton: %d", r);

	if (bind(lsck, (struct sockaddr *)&sa, sizeof sa) == -1)
		err(1, "bind");

	if (listen(lsck, backlog) == -1)
		err(1, "listen");

	//warnx("listening, backlog %d", backlog);

	struct accept_filter_arg accf;
	memset(&accf, 0, sizeof accf);
	strcpy(accf.af_name, "httpready");
	if (setsockopt(lsck, SOL_SOCKET, SO_ACCEPTFILTER, &accf, sizeof accf) == -1)
		err(1, "setsockopt");

	int sck;
	while ((sck = accept(lsck, NULL, NULL)) != -1) {
		//warnx("accepted");

		r = fork();
		if (r == -1)
			err(1, "fork");

		if (r == 0) {
			char buf[128];
			ssize_t ret = read(sck, buf, sizeof buf);
			//if (ret <= 0)
			//	warnx("read: %zd", ret);
			//else
			//	printf("read '%s'\n", buf);

			close(sck);
			exit(0);
		} else
			close(sck);
	}

	err(1, "accept");

}

void
do_connect(int backlog)
{
	int conns = 0;
	struct sockaddr_in sa = {
		.sin_len = sizeof sa,
		.sin_family = AF_INET,
		.sin_port = htons(LISTENPORT)
	};

	int r = inet_pton(AF_INET, LISTENADDR, &sa.sin_addr);
	if (r != 1)
		errx(1, "inet_pton: %d", r);


	while (true) {
		int sck = socket(AF_INET, SOCK_STREAM, 0);
		if (sck == -1)
			err(1, "socket");

		if (connect(sck, (struct sockaddr *)&sa, sizeof sa) == -1) {
			warn("connect");
			break;
		}

		conns++;
	}

	warnx("established %d connections before it blew up, backlog was %d", conns, backlog);

	system("pkill accftest"); // ...
}

int
main(int argc, char **argv)
{
	int backlog = strtol(argv[1], NULL, 0);
	int r = fork();

	if (r == -1)
		err(1, "fork");

	if (r == 0) {
		sleep(1);
		do_connect(backlog);
	} else {
		do_listen(backlog);
	}
}


Timo Buhrmester
2016-05-20 14:20:55 UTC
Just noticed I've been beaten to it already, never mind.

Edgar Fuß
2016-05-25 13:31:21 UTC
Post by Robert Elz
In the case of a server that is broken, and not accepting connections, that's
also clearly what is needed.
But without an Accept Filter, those connection requests would also just fill the queue?
Post by Robert Elz
it should also allow it to decide between reject and accept old pending
connections - in case it wants to log them, or take more drastic counter
measures in the event it appears as if it might be an actual attack attempt
Yes.

I think tls@'s approach would make all this possible at no cost, wouldn't it?
