ipf - RST packet w. sequence num.
rudolf
2009-11-17 21:47:25 UTC
Hi,

I think I've found a disagreement between what our TCP stack considers
part of a connection and what ipf ("keep state") considers part of the
same connection.

Situation:

my_comp ----- Internet ----- other_comp

on other_comp:
================
- port 18000 is closed

# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:05:19.645888 IP (tos 0x0, ttl 46, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0x8a20 (correct), 3076040396:3076040396(0) win 32768 <mss 1380,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
14:05:19.646079 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0x83a3 (correct), 0:0(0) ack 3076040397 win 0

on my_comp:
=============
# ipfstat -io
block out all
pass out quick on lo0 all
pass out quick on rtk0 proto tcp from any to any keep state
pass out quick on rtk0 proto udp from any to any keep state
pass out quick on rtk0 proto icmp from any to any keep state
block in all
pass in quick on lo0 all

# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on rtk0, link-type EN10MB (Ethernet), capture size 96 bytes
20:09:20.158824 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0xbda7 (correct), 2102555286:2102555286(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
20:09:20.451227 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0xa7f2 (correct), 810606391:810606391(0) ack 2102555287 win 0

(launched "telnet IP1 18000" on my_comp)

Although the other_comp is sending the RST packet with sequence number 0
(which is correct, if I read RFC 793 / STD 7 right; section 3.4, Reset
Generation, case 1: "If the incoming segment has an ACK field, the reset
takes its sequence number from the ACK field of the segment, otherwise
the reset has sequence number zero and ..."), it reaches my_comp with
the sequence number set to 810606391 (there may be a Cisco ASA between
the computers; I am not sure).

The problem is that, with the ipf rules above, the packet is not
recognized as part of the connection and is dropped by ipf. telnet
therefore keeps retransmitting SYNs and eventually times out. If I
remove all ipf rules, the TCP stack accepts the RST and reports it to
telnet, which immediately says that the connection was refused.

I am no expert and I am not sure whether I read all the info correctly.
Anyway, I am curious:
1) Why does our stack accept this RST as correct (iiuc, this packet
violates the RFC/standard mentioned above)?
2) If there are reasons for allowing such packets, why does ipf not
recognize them as correct?

This is on netbsd-5,
# ipf -V
ipf: IP Filter: v4.1.29 (396)
Kernel: IP Filter: v4.1.29
[...]

(btw. if the RST packet has seq. num. 0, it is correctly recognized by
the ipf "keep state" rule as part of the session)

Thanks,

r.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
der Mouse
2009-11-17 22:00:31 UTC
Post by rudolf
# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:05:19.645888 IP (tos 0x0, ttl 46, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0x8a20 (correct), 3076040396:3076040396(0) win 32768 <mss 1380,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
14:05:19.646079 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0x83a3 (correct), 0:0(0) ack 3076040397 win 0
# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on rtk0, link-type EN10MB (Ethernet), capture size 96 bytes
20:09:20.158824 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0xbda7 (correct), 2102555286:2102555286(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
20:09:20.451227 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0xa7f2 (correct), 810606391:810606391(0) ack 2102555287 win 0
Although the other_comp is sending the RST packet with sequence
number 0 [...]
It probably is not. tcpdump normally silently converts TCP sequence
numbers to relative numbers, ie it subtracts the base sequence number
for the connection as learned from the first packet seen on the
connection. This is not, however, done for the initial packet, ie, the
one from which tcpdump learns the base sequence number.
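
A minimal Python sketch of the relative numbering described above. It is not tcpdump's actual code; the class name `RelativeSeq` and the idea of keeping one base per direction are assumptions made for illustration. The first sequence number seen in a direction is shown unchanged, and later ones are shown as offsets from it.

```python
# Illustrative only: models "relative" sequence-number display.
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class RelativeSeq:
    def __init__(self):
        # (src, dst) -> first sequence number seen in that direction
        self.base = {}

    def show(self, src, dst, seq):
        key = (src, dst)
        if key not in self.base:
            # Initial packet in this direction: remember it, print as-is.
            self.base[key] = seq
            return seq
        # Later packets: print the offset from the remembered base.
        return (seq - self.base[key]) % MOD


r = RelativeSeq()
client, server = "10.0.1.2.65347", "10.0.1.1.12345"
print(r.show(client, server, 887782560))   # SYN, shown unchanged
print(r.show(server, client, 1041287611))  # SYN+ACK, shown unchanged
print(r.show(server, client, 1041287612))  # first data segment, shown as 1
```

A lone packet in a direction, such as an RST answering a SYN, is the first one seen for that direction, so under this model its sequence number is printed unchanged.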

Try tcpdumping a successful connection and looking at the sequence
numbers. Here's an example I generated just now by running a server on
10.0.1.1 port 12345 which just did "echo foo", then connecting to it
from 10.0.1.2:

(captured on 10.0.1.1)

17:04:43.763725 10.0.1.2.65347 > 10.0.1.1.12345: S 887782560:887782560(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 5826875 0>
17:04:43.767954 10.0.1.1.12345 > 10.0.1.2.65347: S 1041287611:1041287611(0) ack 887782561 win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 5619379 5826875> (DF)
17:04:43.768673 10.0.1.2.65347 > 10.0.1.1.12345: . ack 1 win 17520 <nop,nop,timestamp 5826875 5619379>
17:04:45.762192 10.0.1.1.12345 > 10.0.1.2.65347: P 1:5(4) ack 1 win 17520 <nop,nop,timestamp 5619383 0> (DF)
17:04:45.763800 10.0.1.1.12345 > 10.0.1.2.65347: F 5:5(0) ack 1 win 17520 <nop,nop,timestamp 5619383 0> (DF)
17:04:45.764519 10.0.1.2.65347 > 10.0.1.1.12345: . ack 6 win 17520 <nop,nop,timestamp 5826879 5619383>
17:04:45.766145 10.0.1.2.65347 > 10.0.1.1.12345: F 1:1(0) ack 6 win 17520 <nop,nop,timestamp 5826879 5619383>
17:04:45.768961 10.0.1.1.12345 > 10.0.1.2.65347: . ack 2 win 17520 <nop,nop,timestamp 5619383 5826879> (DF)

(captured on 10.0.1.2):

17:04:43.760448 10.0.1.2.65347 > 10.0.1.1.12345: S 887782560:887782560(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 5826875 0>
17:04:43.765337 10.0.1.1.12345 > 10.0.1.2.65347: S 1041287611:1041287611(0) ack 887782561 win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 5619379 5826875> (DF)
17:04:43.765422 10.0.1.2.65347 > 10.0.1.1.12345: . ack 1 win 17520 <nop,nop,timestamp 5826875 5619379>
17:04:45.759527 10.0.1.1.12345 > 10.0.1.2.65347: P 1:5(4) ack 1 win 17520 <nop,nop,timestamp 5619383 0> (DF)
17:04:45.761151 10.0.1.1.12345 > 10.0.1.2.65347: F 5:5(0) ack 1 win 17520 <nop,nop,timestamp 5619383 0> (DF)
17:04:45.761217 10.0.1.2.65347 > 10.0.1.1.12345: . ack 6 win 17520 <nop,nop,timestamp 5826879 5619383>
17:04:45.762757 10.0.1.2.65347 > 10.0.1.1.12345: F 1:1(0) ack 6 win 17520 <nop,nop,timestamp 5826879 5619383>
17:04:45.766312 10.0.1.1.12345 > 10.0.1.2.65347: . ack 2 win 17520 <nop,nop,timestamp 5619383 5826879> (DF)

Note how the sequence numbers in each direction, after the initial
packet each way, are printed as small integers.

The tcpdump I have here has an option, -S, which disables this. Given
the difference in output format, yours is clearly a different version,
but you might look for an option with similar semantics in yours.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

rudolf
2009-11-18 00:14:12 UTC
Post by der Mouse
Post by rudolf
# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:05:19.645888 IP (tos 0x0, ttl 46, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0x8a20 (correct), 3076040396:3076040396(0) win 32768 <mss 1380,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
14:05:19.646079 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0x83a3 (correct), 0:0(0) ack 3076040397 win 0
# tcpdump -n -p -vvv tcp port 18000
tcpdump: listening on rtk0, link-type EN10MB (Ethernet), capture size 96 bytes
20:09:20.158824 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65351 > IP1.18000: S, cksum 0xbda7 (correct), 2102555286:2102555286(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
20:09:20.451227 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.18000 > IP0.65351: R, cksum 0xa7f2 (correct), 810606391:810606391(0) ack 2102555287 win 0
Although the other_comp is sending the RST packet with sequence
number 0 [...]
It probably is not. tcpdump normally silently converts TCP sequence
numbers to relative numbers, ie it subtracts the base sequence number
for the connection as learned from the first packet seen on the
connection. This is not, however, done for the initial packet, ie, the
one from which tcpdump learns the base sequence number
Try tcpdumping a successful connection and looking at the sequence
numbers.
[...]

OK, thanks. Here are the tcpdump sessions with the "-S" ("Print
absolute, rather than relative, TCP sequence numbers.") option turned on:

1) Here is a session with the expected result (the application correctly
receives ECONNREFUSED), ipf rules applied:

my_comp ---- Internet ---- other_comp_correct

my_comp:
==========
tcpdump version 3.9.7
libpcap version 0.9.4

# tcpdump -n -S -p -vvv tcp port 17000
tcpdump: listening on rtk0, link-type EN10MB (Ethernet), capture size 96 bytes
00:43:36.092862 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65530 > IP1.17000: S, cksum 0x340a (correct), 1925369280:1925369280(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
00:43:36.139869 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP1.17000 > IP0.65530: R, cksum 0x2ddd (correct), 0:0(0) ack 1925369281 win 0

other_comp_correct:
=====================
tcpdump version 3.9.5
libpcap version 0.9.5

# tcpdump -n -p -S -vvv tcp port 17000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
00:39:42.934107 IP (tos 0x0, ttl 56, id 0, offset 0, flags [DF], proto: TCP (6), length: 64) IP0.65530 > IP1.17000: S, cksum 0x6e6f (correct), 1925369280:1925369280(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
00:39:42.934155 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 40) IP1.17000 > IP0.65530: R, cksum 0x6842 (correct), 0:0(0) ack 1925369281 win 0

2) Here is a session where the application waits for a timeout, ipf
rules applied:

my_comp ---- Internet ---- other_comp_incorrect

my_comp:
==========
tcpdump version 3.9.7
libpcap version 0.9.4

# tcpdump -n -p -S -vvv tcp port 18000
tcpdump: listening on rtk0, link-type EN10MB (Ethernet), capture size 96 bytes
01:01:00.704813 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65522 > IP2.18000: S, cksum 0xbef6 (correct), 2749057555:2749057555(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
01:01:00.844241 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP2.18000 > IP0.65522: R, cksum 0x7756 (correct), 2013055350:2013055350(0) ack 2749057556 win 0

other_comp_incorrect:
=====================
tcpdump version 3.9.8
libpcap version 0.9.8

# tcpdump -n -p -S -vvv tcp port 18000
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
18:57:00.588170 IP (tos 0x0, ttl 46, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65522 > IP2.18000: S, cksum 0x5b32 (correct), 3460546084:3460546084(0) win 32768 <mss 1380,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
18:57:00.588369 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP2.18000 > IP0.65522: R, cksum 0x54b5 (correct), 0:0(0) ack 3460546085 win 0
Post by der Mouse
The tcpdump I have here has an option, -S, which disables this. Given
the difference in output format, yours is clearly a different version,
but you might look for an option with similar semantics in yours.
So even with absolute sequence numbers, the non-zero sequence number in
the RST packet is the only significant difference I can see. As I wrote
in my previous email, I think it is probably caused by some appliance
between my_comp and other_comp_incorrect, possibly a Cisco ASA. Now: is
this a correct packet? If so, why does ipf not recognize it as a valid
part of the session when our TCP stack does?

Thanks,

r.

der Mouse
2009-11-18 02:53:54 UTC
My apologies; I read the original mail too quickly and didn't notice
that the zero sequence number in question was not relative.
["correct"]
00:39:42.934107 IP (tos 0x0, ttl 56, id 0, offset 0, flags [DF], proto: TCP (6), length: 64) IP0.65530 > IP1.17000: S, cksum 0x6e6f (correct), 1925369280:1925369280(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
00:39:42.934155 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 40) IP1.17000 > IP0.65530: R, cksum 0x6842 (correct), 0:0(0) ack 1925369281 win 0
This looks right to me. 793 says

If the incoming segment has an ACK field, the reset takes its
sequence number from the ACK field of the segment, otherwise the
reset has sequence number zero and the ACK field is set to the sum
of the sequence number and segment length of the incoming segment.
The connection remains in the CLOSED state.
["incorrect"]
01:01:00.704813 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65522 > IP2.18000: S, cksum 0xbef6 (correct), 2749057555:2749057555(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
01:01:00.844241 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP2.18000 > IP0.65522: R, cksum 0x7756 (correct), 2013055350:2013055350(0) ack 2749057556 win 0
This looks to me like a fairly clear violation of the above spec from
793.
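
The quoted rule can be checked against the two captures with a small Python sketch. The function name `rst_for_closed_port` and its parameters are hypothetical; the only logic it encodes is the RFC 793 text quoted above, plus the fact that a SYN or FIN each occupy one sequence number.

```python
MOD = 2 ** 32  # sequence-number arithmetic is modulo 2**32


def rst_for_closed_port(seg_seq, seg_len, has_ack, seg_ack=0, syn=False, fin=False):
    """Return (seq, ack, ack_flag) of the RST that RFC 793 prescribes
    for a segment arriving at a closed port."""
    if has_ack:
        # Incoming segment carries an ACK: the reset takes its sequence
        # number from that ACK field and carries no ACK of its own.
        return seg_ack % MOD, 0, False
    # No ACK on the incoming segment: sequence number zero, and the ACK
    # field covers the segment (SYN and FIN count as one each).
    length = seg_len + int(syn) + int(fin)
    return 0, (seg_seq + length) % MOD, True


# SYN from the "incorrect" capture, as seen on my_comp.
expected = rst_for_closed_port(2749057555, 0, has_ack=False, syn=True)
observed = (2013055350, 2749057556, True)  # RST as it arrived at my_comp

print(expected)              # (0, 2749057556, True)
print(expected == observed)  # False: only the sequence number differs
```

The same call with the SYN from the "correct" capture (sequence number 1925369280) gives (0, 1925369281, True), which matches the RST seen there.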

It's possible 793 has been updated in this respect. The RFC index
lists only two updates to 793 as of 2009-09-19 (1122 and 3168), even
though I know there are various other updates to TCP (eg, 2581); I
don't know of any way to mechanically identify everything that updates
TCP absent correct maintenance of the updated-by fields in the index, and
I'm not about to read over five thousand RFCs - or even RFC titles - to
try to find any that might apply here.

The RST generation code in our tcp_input does use 0 as the sequence
number for RSTs under these circumstances, for what that's worth. Our
RST handling for SYN_SENT connections does not check this, though, so I
infer that the packet never made it to the TCP stack (unless the
SYN-sending host in question isn't NetBSD).
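
To make the two behaviours concrete, here is a hedged Python sketch. The first function follows RFC 793, which says that in SYN-SENT a RST is acceptable if its ACK field acknowledges the SYN; it assumes the SYN carried no data. The second function is purely hypothetical: it is one model of a stricter check that would be consistent with what was observed in this thread, and it is not taken from ipf's source.

```python
MOD = 2 ** 32  # sequence-number arithmetic is modulo 2**32


def rfc793_accepts_rst_in_syn_sent(iss, rst_has_ack, rst_ack):
    """RFC 793 rule for SYN-SENT: the RST must acknowledge our SYN.
    The RST's own sequence number is not examined."""
    return rst_has_ack and rst_ack == (iss + 1) % MOD


def strict_filter_accepts_rst(iss, rst_has_ack, rst_ack, rst_seq):
    """HYPOTHETICAL stricter check (not ipf's real logic): additionally
    require the SEQ=0 form that a conforming sender would generate."""
    return rfc793_accepts_rst_in_syn_sent(iss, rst_has_ack, rst_ack) and rst_seq == 0


# RST from the "incorrect" capture, as it arrived at my_comp.
iss, ack, seq = 2749057555, 2749057556, 2013055350
print(rfc793_accepts_rst_in_syn_sent(iss, True, ack))  # True
print(strict_filter_accepts_rst(iss, True, ack, seq))  # False
```

Under this model the stack-style check accepts the RST while the stricter check rejects it, and a RST with sequence number 0 passes both.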


rudolf
2009-11-25 22:07:26 UTC
Post by der Mouse
["incorrect"]
01:01:00.704813 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64) IP0.65522 > IP2.18000: S, cksum 0xbef6 (correct), 2749057555:2749057555(0) win 32768 <mss 1460,nop,wscale 3,sackOK,nop,nop,nop,nop,timestamp 1 0>
01:01:00.844241 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto TCP (6), length 40) IP2.18000 > IP0.65522: R, cksum 0x7756 (correct), 2013055350:2013055350(0) ack 2749057556 win 0
This looks to me like a fairly clear violation of the above spec from
793.
Nevertheless, NetBSD accepts this packet as correct if no ipf rules are
involved.

I wonder whether there is any interest in the community in bringing
ipf's and NetBSD's notions of a "TCP session" into sync. Should I file a PR?

Btw. we solved it at work by leaving the "broken" server-hosting provider.

r.

der Mouse
2009-11-25 22:22:09 UTC
Post by rudolf
Post by der Mouse
This looks to me like a fairly clear violation of the above spec
from 793.
Nevertheless, NetBSD accepts this packet as correct if no ipf rules
are involved.
I think it's a defensible position that this is correct behaviour.
Without filtering turned on, there's an implicit "this host is not
trying especially hard to defend itself against hostile network
behaviour", so in borderline cases it should err on the side of
accepting traffic.

I think it's also a defensible position that this is incorrect
behaviour; accepting nonconformant traffic rarely does much more than
pile up trouble for later, and there are so few pieces of the network
that aren't exposed to hostile traffic that I'm not sure it's worth
designing for them in a general-purpose system, especially by default.
Post by rudolf
I wonder if there is any interest from the community to get the ipf's
and NetBSD's notion of "tcp session" to get to sync. - should i file a PR?
I'm not sure it's a question of differing notions of "TCP session"; I
think it may be more that the TCP stack just doesn't bother checking
that particular field. (I don't recall looking at 793 to see whether
it says that field should be ignored on incoming RSTs.)
Post by rudolf
Btw. we solved it at work by leaving the "broken" server-hosting provider.
That would be one of my recommendations, certainly.

