Discussion:
Unsigned wraparound on window size calculations
(too old to reply)
Valery Ushakov
2018-04-18 14:44:22 UTC
Permalink
Hi,

tcp_output.c has a bug when calculating advertised window size after
we have successfully accepted a zero-window probe.

I would like to commit the following change (unified and context
diffs, the latter might be more readbale):

http://www.stderr.spb.ru/~uwe/netbsd/tcp_output-window.diff
http://www.stderr.spb.ru/~uwe/netbsd/tcp_output-window-ctx.diff



I have been sitting on this change for a few years and I almost
completely forgot about it. I'd rather commit it before it's lost.

I ran into this problem with the old copy of the BSD stack in slirp in
VirtualBox. I have fixed it with:

https://www.virtualbox.org/changeset?reponame=vbox&new=51904%40trunk%2Fsrc%2FVBox%2FDevices%2FNetwork%2Fslirp%2Ftcp_output.c&old=44528%40trunk%2Fsrc%2FVBox%2FDevices%2FNetwork%2Fslirp%2Ftcp_output.c


The second part of that fix is present in NetBSD tree and it fixes the
worst of the two problems.

revision 1.112
date: 2004-05-08 18:41:47 +0400; author: chs; state: Exp; lines: +4 -4;
work around an LP64 problem where we report an excessively large window
due to incorrect mixing of types.

The first part is not present, but its impact is probably not very
severe, which is probably why I dropped the ball on that one.



The summary of the problem:

rcv_nxt is "owned" by (updated by) tcp_input(). It's the next
sequence number we expect to receive.

rcv_adv is "owned" by tcp_output(). It's the right(most) side of the
window we advertise to the sender:

rcv_nxt is almost never ahead of rcv_adv, except in one case, when the
peer sends us a window probe after we advertised zero window (rcv_nxt
== rcv_adv) and the probe is acceted. rcv_nxt will be one byte ahead
of rcv_adv causing unsigned wraparound in (tp->rcv_adv - tp->rcv_nxt).


While that uncommitted part of the fix was gathering dust, christos@
has pulled a change from FreeBSD that touches the affected code
fragment. Unfortunately, he didn't pay attention to the fix for this
problem that FreeBSD does have just above the imported change.

revision 1.182
date: 2015-04-27 19:50:17 +0300; author: christos; state: Exp; lines: +10 -2\
;
Apply Revision 220794 from FreeBSD to avoid dup ACKs:

When checking to see if a window update should be sent to the remote peer,
don't force a window update if the window would not actually grow due to
window scaling. [...]



The relevant place in the FreeBSD code is:

https://svnweb.freebsd.org/base/head/sys/netinet/tcp_output.c?revision=332379&view=markup#l631

Their code there is still a bit obfuscated to my taste (with oldwin
being subtracted and added back). I understand how it's got that way
through incremental change, but I'd prefer to fix it for clarity.

While here I've also incorporated FreeBSD r306768 which is a followup
to the change that christos@ has imported:

https://svnweb.freebsd.org/base/head/sys/netinet/tcp_output.c?r1=306767&r2=306768&diff_format=u


The original change in the VBox tree has been tested and has been in
use for the last few years. This new version is only compile tested.
I hope I didn't screw up when doing variable renames and related
refactoring to unobfuscate the update check.

Comments are appreciated. TIA.

-uwe

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Valery Ushakov
2018-04-18 17:03:46 UTC
Permalink
Post by Valery Ushakov
tcp_output.c has a bug when calculating advertised window size after
we have successfully accepted a zero-window probe.
[...]
Post by Valery Ushakov
The second part of that fix is present in NetBSD tree and it fixes the
worst of the two problems.
revision 1.112
date: 2004-05-08 18:41:47 +0400; author: chs; state: Exp; lines: +4 -4;
work around an LP64 problem where we report an excessively large window
due to incorrect mixing of types.
Forgot to mention.

The fix above was replacing (long)(a - b) with (long)(int32_t)(a - b)

We also have in tcp_seq.h a macro:

#define SEQ_SUB(a,b) ((long)((a)-(b)))

that has the same bug. We've got it in

revision 1.12
date: 1998-10-05 01:33:53 +0400; author: matt; state: Exp; lines: +4 -2;
branches: 1.12.46;
Adapt the NEWRENO changes from the UCSB diffs of BSDI 3.0's TCP
to NetBSD. Ignore the SACK & FACK stuff for now.

and it's used in one place in tcp_congctl.c. The macro probably needs
the same fix.

We might also want to use explicit "int32_t" instead of "int" in other
SEQ_* macros.

-uwe

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Loading...