checking m->m_pkthdr.csum_flags in ip

Hi,

In message <***@back-street.net>
on Tue, 15 Apr 2008 20:32:16 +0900 (JST),

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

...

Post by Takahiro Kambe
The kernel has DIAGNOSTIC option enabled and corresponding code
fragments in ip_output().
#ifdef DIAGNOSTIC
	if ((m->m_flags & M_PKTHDR) == 0)
		panic("ip_output: no HDR");
	if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv6|M_CSUM_UDPv6)) != 0) {
		panic("ip_output: IPv6 checksum offload flags: %d",
		    m->m_pkthdr.csum_flags);
	}
	if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv4|M_CSUM_UDPv4)) ==
	    (M_CSUM_TCPv4|M_CSUM_UDPv4)) {
		panic("ip_output: conflicting checksum offload flags: %d",
		    m->m_pkthdr.csum_flags);
	}
#endif
It seems that this diagnostic code checks that M_CSUM_TCPv4 and
M_CSUM_UDPv4 are mutually exclusive.

I confirmed that bge(4) sets both M_CSUM_TCPv4 and M_CSUM_UDPv4 in
m->m_pkthdr.csum_flags for ordinary unicast IP packets.

I don't know whether this is a bug in bge(4) or whether the DIAGNOSTIC
check above is wrong or obsolete.
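
The two csum_flags tests quoted above can be tried outside the kernel with a
small sketch like the following. It is only an illustration: the helper name
csum_flags_would_panic() is hypothetical, and the bit values are placeholders
chosen for the example (the real definitions are in <sys/mbuf.h>).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration; see <sys/mbuf.h> for the real ones. */
#define M_CSUM_TCPv4 0x01u
#define M_CSUM_UDPv4 0x02u
#define M_CSUM_TCPv6 0x10u
#define M_CSUM_UDPv6 0x20u

/*
 * Hypothetical helper: returns true when the csum_flags tests in the
 * DIAGNOSTIC block quoted above would lead to a panic.
 */
static bool
csum_flags_would_panic(uint32_t csum_flags)
{
	/* Any IPv6 offload flag is unexpected on the IPv4 output path. */
	if ((csum_flags & (M_CSUM_TCPv6 | M_CSUM_UDPv6)) != 0)
		return true;

	/* TCPv4 and UDPv4 both set at once is treated as a conflict. */
	if ((csum_flags & (M_CSUM_TCPv4 | M_CSUM_UDPv4)) ==
	    (M_CSUM_TCPv4 | M_CSUM_UDPv4))
		return true;

	return false;
}
```

With these placeholder values, assert(csum_flags_would_panic(M_CSUM_TCPv4 |
M_CSUM_UDPv4)) holds, while a packet carrying only M_CSUM_UDPv4 passes the
check.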

--
Takahiro Kambe <***@back-street.net>

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Thor Lancelot Simon

2008-05-04 16:53:07 UTC

Post by Takahiro Kambe
Hi,
on Tue, 15 Apr 2008 20:32:16 +0900 (JST),

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

This is a bug in the multicast forwarding code (it is related to a bug
I have been investigating in ipf, pf, and bridge). Look at ip_forward()
and ip_flow(): on NetBSD, if you forward a packet using the same mbuf
in which it was received, you must set csum_flags to 0 before handing
that packet to ip_output.

This is because the same flags were used for "hardware checked checksum
on receive" and "hardware should insert checksum on transmit" which, in
my opinion, was a mistake. The existing M_CSUM_DATA and M_CSUM_NO_PSEUDOHDR
would have been sufficient for receive, leaving M_CSUM_TCPv4 (etc.) for
transmit use.

As it is now, if you receive such a packet and forward it without looking
inside the UDP or TCP layer, you can in fact cause the hardware to stamp
a *good* checksum on a packet which had a *bad* one when received, because
you won't check for M_CSUM_TCP_UDP_BAD, but will send the packet into
ip_output() with M_CSUM_TCPv4 or M_CSUM_UDPv4 set, because that's how it
was received.

Anyway, all code forwarding packets on NetBSD must explicitly set
csum_flags to 0 after receive because of this.
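
A userland sketch of the rule above, under stated assumptions: struct
fake_mbuf is a minimal stand-in for the real mbuf, EX_RX_CSUM_BAD is a
hypothetical stand-in for the receive-side "checksum bad" flag, the bit
values are placeholders, and both function names are invented for the
example. It only models how stale receive flags look like a transmit
offload request unless the forwarding code clears them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. */
#define M_CSUM_TCPv4   0x01u
#define M_CSUM_UDPv4   0x02u
#define EX_RX_CSUM_BAD 0x04u	/* hypothetical "hardware saw a bad checksum" */

/* Minimal stand-in for the part of an mbuf packet header used here. */
struct fake_pkthdr {
	uint32_t csum_flags;
};
struct fake_mbuf {
	struct fake_pkthdr m_pkthdr;
};

/*
 * Hypothetical model of the output path: nonzero when the flags would
 * ask the hardware to insert a TCP or UDP checksum on transmit.
 */
static int
tx_requests_hw_checksum(const struct fake_mbuf *m)
{
	return (m->m_pkthdr.csum_flags & (M_CSUM_TCPv4 | M_CSUM_UDPv4)) != 0;
}

/*
 * Hypothetical forwarding step: drop whatever the receive path left in
 * csum_flags before the same mbuf is handed to the output path.
 */
static void
forward_prepare(struct fake_mbuf *m)
{
	assert(m != NULL);
	m->m_pkthdr.csum_flags = 0;
}
```

In this model, an mbuf received with M_CSUM_UDPv4 | EX_RX_CSUM_BAD makes
tx_requests_hw_checksum() return nonzero, so the bad checksum would be
overwritten on transmit; after forward_prepare() it returns 0 and the
packet goes out with the checksum it arrived with.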

Thor


Takahiro Kambe

2008-05-08 05:14:19 UTC

Post by Thor Lancelot Simon

Hi,

In message <***@panix.com>
on Sun, 4 May 2008 12:53:07 -0400,

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

Thanks very much for your explanation.

Post by Thor Lancelot Simon
This is because the same flags were used for "hardware checked checksum
on receive" and "hardware should insert checksum on transmit" which, in
my opinion, was a mistake. The existing M_CSUM_DATA and M_CSUM_NO_PSEUDOHDR
would have been sufficient for receive, leaving M_CSUM_TCPv4 (etc.) for
transmit use.

I agree with your opinion.

Post by Thor Lancelot Simon
Anyway, all code forwarding packets on NetBSD must explicitly set
csum_flags to 0 after receive because of this.

Though I don't understand the code in ip_mroute.c very well, the attached
patch might make things better. (Not tested, since I don't have a
test environment right now; I hope to test it this month.)
--
Takahiro Kambe <***@back-street.net>

Index: sys/netinet/ip_mroute.c
===================================================================
RCS file: /cvs/src-4/sys/netinet/ip_mroute.c,v
retrieving revision 1.1.1.1
diff -u -p -d -d -u -p -r1.1.1.1 ip_mroute.c
--- sys/netinet/ip_mroute.c 7 Feb 2007 01:50:26 -0000 1.1.1.1
+++ sys/netinet/ip_mroute.c 6 May 2008 04:51:24 -0000
@@ -1425,6 +1425,11 @@ ip_mforward(struct mbuf *m, struct ifnet
return (1);
}

+ /*
+ * Clear any in-bound checksum flags for this packet.
+ */
+ m->m_pkthdr.csum_flags = 0;
+
#ifdef RSVP_ISI
if (imo && ((vifi = imo->imo_multicast_vif) < numvifs)) {
if (ip->ip_ttl < MAXTTL)


Jason Thorpe

2008-05-08 06:24:36 UTC

Post by Takahiro Kambe

Post by Thor Lancelot Simon
Anyway, all code forwarding packets on NetBSD must explicitly set
csum_flags to 0 after receive because of this.

Though I don't understand the code in ip_mroute.c very well, the attached
patch might make things better. (Not tested, since I don't have a
test environment right now; I hope to test it this month.)

Your patch looks correct. Please check it in.

Post by Takahiro Kambe
--
Index: sys/netinet/ip_mroute.c
===================================================================
RCS file: /cvs/src-4/sys/netinet/ip_mroute.c,v
retrieving revision 1.1.1.1
diff -u -p -d -d -u -p -r1.1.1.1 ip_mroute.c
--- sys/netinet/ip_mroute.c 7 Feb 2007 01:50:26 -0000 1.1.1.1
+++ sys/netinet/ip_mroute.c 6 May 2008 04:51:24 -0000
@@ -1425,6 +1425,11 @@ ip_mforward(struct mbuf *m, struct ifnet
return (1);
}
+ /*
+ * Clear any in-bound checksum flags for this packet.
+ */
+ m->m_pkthdr.csum_flags = 0;
+
#ifdef RSVP_ISI
if (imo && ((vifi = imo->imo_multicast_vif) < numvifs)) {
if (ip->ip_ttl < MAXTTL)

-- thorpej


Takahiro Kambe

2008-05-08 08:02:51 UTC

In message <8B68CB47-1D64-48A0-A8C1-***@shagadelic.org>
on Wed, 7 May 2008 23:24:36 -0700,

Post by Jason Thorpe

Post by Takahiro Kambe

Post by Thor Lancelot Simon
Anyway, all code forwarding packets on NetBSD must explicitly set
csum_flags to 0 after receive because of this.

Though I don't understand the code in ip_mroute.c very well, the attached
patch might make things better. (Not tested, since I don't have a
test environment right now; I hope to test it this month.)

Your patch looks correct. Please check it in.

Done. I'll also request a pull-up to the netbsd-4 branch. This problem
doesn't exist on the netbsd-3 branch and earlier.

--
Takahiro Kambe <***@back-street.net>


Patrick Welche

2008-05-16 14:05:03 UTC

Post by Takahiro Kambe
Hi,
on Tue, 15 Apr 2008 20:32:16 +0900 (JST),

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

...

I confirmed that bge(4) sets both M_CSUM_TCPv4 and M_CSUM_UDPv4 in
m->m_pkthdr.csum_flags for ordinary unicast IP packets.
I don't know whether this is a bug in bge(4) or whether the DIAGNOSTIC
check above is wrong or obsolete.

I don't know whether this is relevant, but a 4.99.60/i386 box with bge gave:

uvm_fault(0xcdfae574, 0, 1) -> 0xe
kernel: supervisor trap page fault, code=0
Stopped in pid 22172.1 (dhcpd) at 0xc03a6f25: movl 0x14(%eax),%eax
db{1}> bt/l
m_length(0,0,cd985abc,c0377c4f,5) at 0xc03a6f25
bpf_mtap(c2d822c0,0,cd985aec,c03a8f5d,cd985a05) at netbsd:bpf_mtap+0x17
bge_start(c2da7004,178,9000003,3,0) at netbsd:bge_start+0x10c
ifq_enqueue(c2da7004,c3111300,c2da7004,2,cdfae574) at netbsd:ifq_enqueue+0x13f
ether_output(c2da7004,c3111300,c06077a0,0,c06077a0) at netbsd:ether_output+0x71e
bpf_write(cdc82300,cdc82300,cd985c60,d5bf99c0,1) at netbsd:bpf_write+0x126
do_filewritev(7,bfbfc668,3,cdc82300,1) at netbsd:do_filewritev+0x270
sys_writev(cdfac900,cd985d04,cd985cfc,cd985d10,c03d0d79) at netbsd:sys_writev+0x3f
syscall(cd985d48,b3,ab,bfbf001f,bfbf001f) at netbsd:syscall+0x141

yesterday...

Cheers,

Patrick


David Young

2008-05-16 22:14:10 UTC

Post by Takahiro Kambe
Hi,
on Tue, 15 Apr 2008 20:32:16 +0900 (JST),

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

...

I confirmed that bge(4) sets both M_CSUM_TCPv4 and M_CSUM_UDPv4 in
m->m_pkthdr.csum_flags for ordinary unicast IP packets.
I don't know whether this is a bug in bge(4) or whether the DIAGNOSTIC
check above is wrong or obsolete.

uvm_fault(0xcdfae574, 0, 1) -> 0xe
kernel: supervisor trap page fault, code=0
Stopped in pid 22172.1 (dhcpd) at 0xc03a6f25: movl 0x14(%eax),%eax
db{1}> bt/l
m_length(0,0,cd985abc,c0377c4f,5) at 0xc03a6f25
bpf_mtap(c2d822c0,0,cd985aec,c03a8f5d,cd985a05) at netbsd:bpf_mtap+0x17
bge_start(c2da7004,178,9000003,3,0) at netbsd:bge_start+0x10c
ifq_enqueue(c2da7004,c3111300,c2da7004,2,cdfae574) at netbsd:ifq_enqueue+0x13f
ether_output(c2da7004,c3111300,c06077a0,0,c06077a0) at netbsd:ether_output+0x71e
bpf_write(cdc82300,cdc82300,cd985c60,d5bf99c0,1) at netbsd:bpf_write+0x126
do_filewritev(7,bfbfc668,3,cdc82300,1) at netbsd:do_filewritev+0x270
sys_writev(cdfac900,cd985d04,cd985cfc,cd985d10,c03d0d79) at netbsd:sys_writev+0x3f
syscall(cd985d48,b3,ab,bfbf001f,bfbf001f) at netbsd:syscall+0x141
yesterday...

It looks like IFQ_POLL()/IFQ_DEQUEUE() did not honor their contract. In
order to reach the bpf_mtap() statement, IFQ_POLL() had to return m_head
!= NULL. According to altq(9), "It is guaranteed that IFQ_DEQUEUE()
immediately after IFQ_POLL() returns the same packet."
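
The poll-then-dequeue contract described above can be modelled in userland
roughly as follows. This is a sketch under assumptions, not the bge(4) code:
struct pkt, struct pktq, q_poll(), q_dequeue() and start_sketch() are
hypothetical stand-ins for the mbuf, the interface send queue, IFQ_POLL(),
IFQ_DEQUEUE() and a driver start routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an mbuf and an interface send queue. */
struct pkt {
	struct pkt *next;
	int len;
};
struct pktq {
	struct pkt *head;
};

/* Peek at the head of the queue without removing it (models IFQ_POLL). */
static struct pkt *
q_poll(const struct pktq *q)
{
	return q->head;
}

/* Remove and return the head of the queue (models IFQ_DEQUEUE). */
static struct pkt *
q_dequeue(struct pktq *q)
{
	struct pkt *p = q->head;

	if (p != NULL) {
		q->head = p->next;
		p->next = NULL;
	}
	return p;
}

/*
 * Hypothetical start routine: drains the queue and returns the total
 * length of the packets it "transmitted".
 */
static int
start_sketch(struct pktq *q)
{
	int sent = 0;

	for (;;) {
		struct pkt *peeked = q_poll(q);

		if (peeked == NULL)
			break;

		/* A real driver would check for free TX descriptors here. */

		struct pkt *p = q_dequeue(q);

		/* The contract: dequeue right after poll yields the same packet. */
		assert(p == peeked);
		sent += p->len;
	}
	return sent;
}
```

If something else drains the queue between q_poll() and q_dequeue(), the
dequeue can return NULL even though the poll did not, which is the kind of
state the m_length(0, ...) frame in the backtrace suggests.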

Are you using ALTQ?

Dave

--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24


Patrick Welche

2008-05-17 12:37:58 UTC

Post by Takahiro Kambe
Hi,
on Tue, 15 Apr 2008 20:32:16 +0900 (JST),

Post by Takahiro Kambe
Today, a NetBSD 4.0_STABLE machine panicked in ip_output() when
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.

...

I confirmed that bge(4) sets both M_CSUM_TCPv4 and M_CSUM_UDPv4 in
m->m_pkthdr.csum_flags for ordinary unicast IP packets.
I don't know whether this is a bug in bge(4) or whether the DIAGNOSTIC
check above is wrong or obsolete.

No altq - also this was a kernel from 21st April - I hope I didn't
hijack Takahiro's thread - just noticed that they were both with
bge.

Cheers,

Patrick


David Young

2008-05-19 17:36:18 UTC

Post by Patrick Welche
uvm_fault(0xcdfae574, 0, 1) -> 0xe
kernel: supervisor trap page fault, code=0
Stopped in pid 22172.1 (dhcpd) at 0xc03a6f25: movl 0x14(%eax),%eax
db{1}> bt/l
m_length(0,0,cd985abc,c0377c4f,5) at 0xc03a6f25
bpf_mtap(c2d822c0,0,cd985aec,c03a8f5d,cd985a05) at netbsd:bpf_mtap+0x17
bge_start(c2da7004,178,9000003,3,0) at netbsd:bge_start+0x10c
ifq_enqueue(c2da7004,c3111300,c2da7004,2,cdfae574) at netbsd:ifq_enqueue+0x13f
ether_output(c2da7004,c3111300,c06077a0,0,c06077a0) at netbsd:ether_output+0x71e
bpf_write(cdc82300,cdc82300,cd985c60,d5bf99c0,1) at netbsd:bpf_write+0x126
do_filewritev(7,bfbfc668,3,cdc82300,1) at netbsd:do_filewritev+0x270
sys_writev(cdfac900,cd985d04,cd985cfc,cd985d10,c03d0d79) at netbsd:sys_writev+0x3f
syscall(cd985d48,b3,ab,bfbf001f,bfbf001f) at netbsd:syscall+0x141
yesterday...

No altq - also this was a kernel from 21st April - I hope I didn't
hijack Takahiro's thread - just noticed that they were both with
bge.

Is this an SMP box? I don't know how this could happen unless a second
thread or an interrupt handler ran bge_start() simultaneously with the
thread where the fault occurred. Looking at the bge(4) code, I don't
see how that could happen.

Dave

Andrew Doran

2008-05-19 17:43:04 UTC

No altq - also this was a kernel from 21st April - I hope I didn't
hijack Takahiro's thread - just noticed that they were both with
bge.

Without looking at the code, it seems that the bpf fileops need to take
kernel_lock.

Andrew


Patrick Welche

2008-05-19 18:57:36 UTC