Discussion:
Trimming TCP options
Mihai Chelaru
2010-12-29 21:57:29 UTC
Hi,

I'd like to drop paddings for wscale, sack, tcpsig and timestamps in syn
options.

Currently our TCP code pads every option to 32 bits, although as far as I
know this is not required by RFC 793 or other standards, except for a
special case for timestamps. For example, on NetBSD 5.1 (RC2) a simple
telnet ends up sending SYNs with 4 consecutive nop options, and for 4
options we send 5 nops in total:

<mss 1300,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>

After this change, options will look like:
<mss 1300,wscale 3,sackOK,timestamp 1 0,eol>

Also, as an immediate result, TCP_SIGNATURE started to work instead of
panicking the kernel after overflowing the 40-byte option buffer.
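
The byte counts behind the two layouts above can be checked with a short sketch. The option lengths are the standard on-the-wire ones; the names `OPTION_LENGTH` and `options_length` are made up for illustration and are not taken from the patch.

```python
# Standard on-the-wire lengths, in bytes, of the options shown above.
OPTION_LENGTH = {
    "mss": 4,
    "wscale": 3,
    "sackOK": 2,
    "timestamp": 10,
    "nop": 1,
    "eol": 1,
}

TCP_MAX_OPTION_BYTES = 40  # 60-byte maximum TCP header minus the 20-byte fixed part
TCP_SIGNATURE_BYTES = 18   # TCP MD5 signature option length (RFC 2385)


def options_length(names):
    """Return the total length in bytes of a list of option names."""
    return sum(OPTION_LENGTH[name] for name in names)


padded = ["mss", "nop", "wscale", "sackOK", "nop", "nop", "nop", "nop", "timestamp"]
trimmed = ["mss", "wscale", "sackOK", "timestamp", "eol"]

print(options_length(padded))   # 24
print(options_length(trimmed))  # 20
```

Without the trailing eol the trimmed options take 19 bytes, so an 18-byte signature option brings the total to 37, which pads out to exactly 40. The padded layout is already at 24 bytes, and 24 + 18 = 42 exceeds the 40-byte limit even before any padding of the signature itself. How the old code laid out the signature is not shown in this mail, so this is only the arithmetic.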

Patch is attached. Opinions?
--
Mihai
Joerg Sonnenberger
2010-12-29 22:02:38 UTC
Post by Mihai Chelaru
I'd like to drop paddings for wscale, sack, tcpsig and timestamps in syn
options.
Just make sure that at least the timestamps option is properly 32-bit
aligned.

Joerg

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Mihai Chelaru
2010-12-29 22:05:35 UTC
Post by Joerg Sonnenberger
Post by Mihai Chelaru
I'd like to drop paddings for wscale, sack, tcpsig and timestamps in syn
options.
Just make sure that at least the timestamps option is properly 32-bit
aligned.
Joerg
I align it for non-SYN segments only, as RFC 1323 Appendix A recommends.
--
Mihai


Christos Zoulas
2010-12-29 22:10:48 UTC
Post by Mihai Chelaru
Hi,
I'd like to drop paddings for wscale, sack, tcpsig and timestamps in syn
options.
Currently our TCP code pads every option to 32 bits, although as far as I
know this is not required by RFC 793 or other standards, except for a
special case for timestamps. For example, on NetBSD 5.1 (RC2) a simple
telnet ends up sending SYNs with 4 consecutive nop options, and for 4
options we send 5 nops in total:
<mss 1300,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
After this change, options will look like:
<mss 1300,wscale 3,sackOK,timestamp 1 0,eol>
Also, as an immediate result, TCP_SIGNATURE started to work instead of
panicking the kernel after overflowing the 40-byte option buffer.
Patch is attached. Opinions?
If the options happened to be properly aligned, you don't send TCPOPT_EOL
anymore?

christos


Mihai Chelaru
2010-12-29 22:20:21 UTC
Post by Christos Zoulas
If the options happened to be properly aligned, you don't send TCPOPT_EOL
anymore?
RFC 793 says this about the EOL option:

This option code indicates the end of the option list. This
might not coincide with the end of the TCP header according to
the Data Offset field. This is used at the end of all options,
not the end of each option, and need only be used if the end of
the options would not otherwise coincide with the end of the TCP
header.
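
As a rough illustration of that rule (a sketch only, not the code from the patch; the helper name `pad_with_eol` is invented here), EOL bytes are appended only when the option block does not already end on a 32-bit boundary:

```python
TCPOPT_EOL = 0  # End of Option List kind (RFC 793)


def pad_with_eol(options: bytes) -> bytes:
    """Pad an option block to a multiple of 4 bytes using EOL bytes.

    Nothing is appended when the block already ends on a 32-bit boundary.
    """
    remainder = len(options) % 4
    if remainder == 0:
        return options
    return options + bytes([TCPOPT_EOL]) * (4 - remainder)


# mss 1300, wscale 3, sackOK, timestamp 1 0 -> 19 bytes before padding
compact = bytes.fromhex("02040514" "030303" "0402" "080a0000000100000000")
print(len(compact), len(pad_with_eol(compact)))  # 19 20
```

With these four options a single EOL is enough. A block that is already a multiple of 4 bytes long is returned unchanged, which is the case the question above is about.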
--
Mihai


Greg Troxel
2010-12-29 22:45:30 UTC
Post by Mihai Chelaru
I'd like to drop paddings for wscale, sack, tcpsig and timestamps in syn
options.
Currently our TCP code pads every option to 32 bits, although as far as I
know this is not required by RFC 793 or other standards, except for a
special case for timestamps. For example, on NetBSD 5.1 (RC2) a simple
telnet ends up sending SYNs with 4 consecutive nop options, and for 4
options we send 5 nops in total:
<mss 1300,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
After this change, options will look like:
<mss 1300,wscale 3,sackOK,timestamp 1 0,eol>
Also, as an immediate result, TCP_SIGNATURE started to work instead of
panicking the kernel after overflowing the 40-byte option buffer.
Patch is attached. Opinions?
Could you explain

how long you've been running this

the kinds of peers it's been tested against

how other implementations behave? (Specifically, does this make us
like the rest, or different?)
Mihai Chelaru
2010-12-30 15:06:53 UTC
Post by Greg Troxel
Could you explain
how long you've been running this
This is my workstation, up since yesterday with normal tasks - browsing,
mail, radio, torrents, ssh/telnet/rdp client etc.

$ netstat -p tcp | head -n 30
tcp:
    28575080 packets sent
        22374753 data packets (8869011676 bytes)
        348177 data packets (169228304 bytes) retransmitted
        4891895 ack-only packets (5772157 delayed)
        0 URG only packets
        659 window probe packets
        805796 window update packets
        153802 control packets
        0 send attempts resulted in self-quench
    24712387 packets received
        11815518 acks (for 8297119379 bytes)
        0 duplicate acks
        0 acks for unsent data
        9557536 packets (4436243925 bytes) received in-sequence
        2824139 completely duplicate packets (26028580 bytes)
        1782 old duplicate packets
        19228 packets with some dup. data (4133666 bytes duped)
        273807 out-of-order packets (126071925 bytes)
        0 packets (0 bytes) of data after window
        0 window probes
        77704 window update packets
        92109 packets received after close
        216 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short
    18219 connection requests
    107319 connection accepts
    118912 connections established (including accepts)
    125808 connections closed (including 1134 drops)
Post by Greg Troxel
the kinds of peers it's been tested against
windows, freebsd, linux, cisco, maemo
Post by Greg Troxel
how other implementations behave? (Specifically, does this make us
like the rest, or different?)
Bytes requested by each option:

            SACKP  WScale  Timestamps  Use_EOL

NetBSD      4      4       12          N
FreeBSD*    2-3    3-4     10-13       Y
Linux-2627  2      4       10          Y
Win2003     4      4       12          N(?)
Cisco IOS   4      4       10          Y
WinXP       4      4       12          N(?)
Win7        4      4       ?           N(?)
Required    2      3       10**        Y

*  - FreeBSD starts/ends SACKPERM at 2 bytes boundary
             ends WScale at 4 bytes boundary
             ends TS at 4 bytes boundary
             starts/ends MSS at 4 bytes boundary
** - 12 for non-SYN segments


Somehow I didn't include the tcp_input.c patch in the first mail. I have attached it now.
--
Mihai
Chuck Swiger
2011-01-03 22:04:19 UTC
Hi, Mihai--

I'd meant to reply to this earlier, but holiday travel interfered. :-)
Post by Mihai Chelaru
Post by Greg Troxel
Could you explain
how long you've been running this
This is my workstation, up since yesterday with normal tasks - browsing,
mail, radio, torrents, ssh/telnet/rdp client etc.
It's important to note that (most?) TCP stacks will resend SYN packets with fewer TCP options set if the initial SYN request doesn't get answered in some fashion. To validate that your changes are working as expected, you should run tcpdump with a filter for SYN packets and watch for duplicate SYNs being sent.
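
One way to spot such duplicates once the SYNs have been captured is to group them by connection and initial sequence number, since a retransmitted SYN reuses both. The following is only a sketch: the record format, the function name `find_retransmitted_syns` and the sample addresses are invented for illustration and do not come from any real capture.

```python
from collections import defaultdict


def find_retransmitted_syns(syns):
    """Group SYN records by connection and return those seen more than once.

    Each record is a (src, sport, dst, dport, seq, options) tuple, a made-up
    format standing in for whatever is parsed out of a capture.
    """
    by_connection = defaultdict(list)
    for src, sport, dst, dport, seq, options in syns:
        by_connection[(src, sport, dst, dport, seq)].append(options)
    return {key: opts for key, opts in by_connection.items() if len(opts) > 1}


# Invented sample data: the first and third records are the same SYN.
capture = [
    ("10.0.0.1", 50000, "10.0.0.2", 80, 1000, "mss,wscale,sackOK,timestamp,eol"),
    ("10.0.0.1", 50001, "10.0.0.2", 22, 2000, "mss,wscale,sackOK,timestamp,eol"),
    ("10.0.0.1", 50000, "10.0.0.2", 80, 1000, "mss"),
]
for key, opts in find_retransmitted_syns(capture).items():
    print(key, opts)
```

Each entry in the result lists the option sets in the order the SYNs were sent, so a change in options between the first SYN and a retransmit is visible directly.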
Post by Mihai Chelaru
Post by Greg Troxel
how other implementations behave? (Specifically, does this make us
like the rest, or different?)
            SACKP  WScale  Timestamps  Use_EOL
NetBSD      4      4       12          N
FreeBSD*    2-3    3-4     10-13       Y
Linux-2627  2      4       10          Y
Win2003     4      4       12          N(?)
Cisco IOS   4      4       10          Y
WinXP       4      4       12          N(?)
Win7        4      4       ?           N(?)
Required    2      3       10**        Y
*  - FreeBSD starts/ends SACKPERM at 2 bytes boundary
             ends WScale at 4 bytes boundary
             ends TS at 4 bytes boundary
             starts/ends MSS at 4 bytes boundary
** - 12 for non-SYN segments
Hmm, I find it more useful to describe TCP options by strings representing the options in the sequence they appear, which is the mechanism various OS fingerprinting tools like NMAP, p0f, etc use.

NetBSD: MNWSNNNT

Cisco IOS (11, 12): M
Cisco VPN concentrators: MNNS
FreeBSD (4.x & early 5.x): MNWNNT
FreeBSD (5.3+, 6.x): MNWNNTNNS
FreeBSD (7.x+): MNWST
Linux: MSTNW (although MTWSN and others can appear)
MacOS X: MNWNNTS (10.3 or older uses MNWNNT, similar to FreeBSD 4.x)
NMAP probes: WNMT
OpenBSD: MNNSNWNNT (HP/UX 11.x also uses this)
Solaris: NNTNWM
Win 98/ME: MNNS
Win 2000/XP SP2 or older: MNWNNS
Win 2000 (Server variants)/2003/XP (SP3): MNWNNTNNS

There can be a lot of variation depending on whether SACK, TCP timestamps, RFC 1323 extensions, etc are enabled. However, it was widely common to use "NNT" to ensure that the timestamp would be 32-bit aligned; and somewhat common to use "NNS" for similar reasons (although it shouldn't be required, as you've noted).
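
To see why "NNT" gives an aligned timestamp, the layout strings can be turned into byte offsets. This is a small sketch using the standard option lengths; the names `CODE_LENGTH` and `timestamp_value_offset` are made up for the example. Offsets are counted from the start of the options, which begin 20 bytes into the TCP header, so alignment relative to the options is the same as alignment relative to the header.

```python
# Option lengths in bytes for the single-letter codes used in the list above:
# M = MSS, N = NOP, W = window scale, S = SACK permitted, T = timestamps, E = EOL.
CODE_LENGTH = {"M": 4, "N": 1, "W": 3, "S": 2, "T": 10, "E": 1}


def timestamp_value_offset(layout):
    """Return the byte offset of the timestamp value within the options.

    The value starts 2 bytes into the T option, after its kind and length
    bytes. Returns None when the layout has no T.
    """
    offset = 0
    for code in layout:
        if code == "T":
            return offset + 2
        offset += CODE_LENGTH[code]
    return None


for layout in ("MNWNNT", "MNWST", "MSTNW", "MWSTE"):
    offset = timestamp_value_offset(layout)
    print(layout, offset, offset % 4 == 0)
```

The first three layouts put the timestamp value at offset 12, 12 and 8, all on a 32-bit boundary. The fully compacted "MWSTE" puts it at offset 11, which is not aligned.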

The approach you're using of compressing the options as best you can, and then adding TCPOPT_EOL(s) at the end to maintain the 32-bit alignment looks good.

Regards,
--
-Chuck


Mihai Chelaru
2011-01-04 09:48:35 UTC
Post by Chuck Swiger
It's important to note that (most?) TCP stacks will resend SYN packets with fewer TCP options set if the initial SYN request doesn't get answered in some fashion. To validate that your changes are working as expected, you should run tcpdump with a filter for SYN packets and watch for duplicate SYNs being sent.
Hi Chuck!

That's not the case for NetBSD, where by default all 3 retransmits use the
same options.

This morning I modified that code a little: the initial SYN and the first
retransmit carry the full options, the second retransmit carries the MSS
option only, and the last one carries no options at all. I also added some
stats. So far the results look good:

1072 connection requests
1654 connection accepts
2196 connections established (including accepts)
2721 connections closed (including 47 drops)
145 embryonic connections dropped
400 SYN options degraded -> incremented at second syn re-xmit
2 connected with no options -> connection established after
second syn re-xmit

I'll keep an eye on the last counter in the following days.
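
The fallback schedule described above can be written down as a small table-like function. This is only a sketch of the policy as described in this mail, not the kernel code; `FULL_OPTIONS` and `syn_options_for_attempt` are hypothetical names.

```python
FULL_OPTIONS = ("mss", "wscale", "sackOK", "timestamp")


def syn_options_for_attempt(attempt):
    """Return the option names to send on a given SYN transmission.

    attempt 0 is the initial SYN; 1, 2 and 3 are the retransmits.
    """
    if attempt <= 1:
        return FULL_OPTIONS  # initial SYN and first retransmit
    if attempt == 2:
        return ("mss",)      # second retransmit: MSS only
    return ()                # last retransmit: no options at all


for attempt in range(4):
    print(attempt, syn_options_for_attempt(attempt))
```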
Post by Chuck Swiger
Hmm, I find it more useful to describe TCP options by strings representing the options in the sequence they appear, which is the mechanism various OS fingerprinting tools like NMAP, p0f, etc use.
NetBSD: MNWSNNNT
It's even MNWSNNNNT (one more N).
Post by Chuck Swiger
Cisco IOS (11, 12): M
Cisco VPN concentrators: MNNS
FreeBSD (4.x& early 5.x): MNWNNT
FreeBSD (5.3+, 6.x): MNWNNTNNS
FreeBSD (7.x+): MNWST
Linux: MSTNW (although MTWSN and others can appear)
MacOS X: MNWNNTS (10.3 or older uses MNWNNT, similar to FreeBSD 4.x)
MNWNNTSNN or MNWNNTSE0 ?
Post by Chuck Swiger
NMAP probes: WNMT
OpenBSD: MNNSNWNNT (HP/UX 11.x also uses this)
Solaris: NNTNWM
Win 98/ME: MNNS
Win 2000/XP SP2 or older: MNWNNS
Win 2000 (Server variants)/2003/XP (SP3): MNWNNTNNS
There can be a lot of variation depending on whether SACK, TCP timestamps, RFC 1323 extensions, etc are enabled. However, it was widely common to use "NNT" to ensure that the timestamp would be 32-bit aligned; and somewhat common to use "NNS" for similar reasons (although it shouldn't be required, as you've noted).
The approach you're using of compressing the options as best you can, and then adding TCPOPT_EOL(s) at the end to maintain the 32-bit alignment looks good.
Regards,
Thanks,
--
Mihai

Chuck Swiger
2011-01-04 18:08:03 UTC
Hi, Mihai--
Post by Mihai Chelaru
That's not the case for NetBSD, where by default all 3 retransmits use the same options.
1072 connection requests
1654 connection accepts
2196 connections established (including accepts)
2721 connections closed (including 47 drops)
145 embryonic connections dropped
400 SYN options degraded -> incremented at second syn re-xmit
2 connected with no options -> connection established after second syn re-xmit
I'll keep an eye on the last counter in the following days.
Excellent. While TCP stacks are required to implement NOP, EOL, and MSS, and should also be written to simply pass over options which they do not understand, some don't, and some folks use firewalls which drop various options, so falling back to M only is a good procedure.
Post by Chuck Swiger
Hmm, I find it more useful to describe TCP options by strings representing the options in the sequence they appear, which is the mechanism various OS fingerprinting tools like NMAP, p0f, etc use.
NetBSD: MNWSNNNT
It's even MNWSNNNNT (one more N).
OK.
Post by Chuck Swiger
MacOS X: MNWNNTS (10.3 or older uses MNWNNT, similar to FreeBSD 4.x)
MNWNNTSNN or MNWNNTSE0 ?
MNWNNTSE0:

09:54:14.219381 IP6 2620::1b00:2211:217:f2ff:fe08:ae62.55065 > www.netbsd.org.http: Flags [S], seq 2507270230, win 65535, options [mss 1440,nop,wscale 2,nop,nop,TS val 425013903 ecr 0,sackOK,eol], length 0
0x0000: 6000 0000 002c 0640 2620 0000 1b00 2211 `....,.@&.....".
0x0010: 0217 f2ff fe08 ae62 2001 04f8 0003 0007 .......b........
0x0020: 02e0 81ff fe52 9a6b d719 0050 9571 e856 .....R.k...P.q.V
0x0030: 0000 0000 b002 ffff 4ea9 0000 0204 05a0 ........N.......
0x0040: 0103 0302 0101 080a 1955 328f 0000 0000 .........U2.....
0x0050: 0402 0000 ....

Most implementations will emit trailing EOLs / zeros as needed to pad out to a 32-bit boundary, rather than using trailing NOPs.
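
The layout string can be read straight off the option bytes in the dump. Below is a minimal sketch of such a walk; `CODE_FOR_KIND` and `option_layout` are names invented for this example, and only the option kinds that appear in this thread are mapped.

```python
CODE_FOR_KIND = {0: "E", 1: "N", 2: "M", 3: "W", 4: "S", 8: "T"}


def option_layout(options: bytes) -> str:
    """Walk a TCP option block and return its single-letter layout string.

    Parsing stops after EOL (kind 0); unknown kinds are shown as "?".
    """
    layout = []
    i = 0
    while i < len(options):
        kind = options[i]
        layout.append(CODE_FOR_KIND.get(kind, "?"))
        if kind == 0:        # EOL: the rest is padding
            break
        if kind == 1:        # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break            # truncated or malformed option
        i += options[i + 1]  # the length byte covers kind, length and data
    return "".join(layout)


# The 24 option bytes from the dump above (offsets 0x003c through 0x0053).
raw = bytes.fromhex(
    "020405a0" "01" "030302" "0101" "080a1955328f00000000" "0402" "0000"
)
print(option_layout(raw))  # MNWNNTSE
```

The walk stops at the EOL, so the final zero byte of padding (the "0" in MNWNNTSE0) is not reported as a separate option.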

Regards,
--
-Chuck


Mihai Chelaru
2011-01-06 17:26:26 UTC
Post by Mihai Chelaru
I'll keep an eye on the last counter in the following days.
I need to reboot this machine, so I have attached the latest stats and the
sys/netinet diff. I haven't seen anything strange in use so far.
--
Mihai