Takahiro Kambe
2008-04-15 11:32:16 UTC
Hi,
Today a NetBSD 4.0_STABLE machine panicked in ip_output() while
forwarding an IPv4 multicast packet. The packet was a short (36 octets)
UDP/IP packet.
#0 0xc0424c95 in cpu_reboot (howto=0x0, bootstr=0x0)
at ../../../../arch/i386/i386/machdep.c:896
#1 0xc039f588 in panic (
fmt=0xc062ba50 "ip_output: conflicting checksum offload flags: %d")
at ../../../../kern/subr_prf.c:246
#2 0xc012908b in ip_output (m0=0xc1f84600)
at ../../../../netinet/ip_output.c:246
#3 0xc0124332 in tbf_send_packet (vifp=0xc06d97d0, m=0x0)
at ../../../../netinet/ip_mroute.c:2222
#4 0xc01251f4 in ip_mdq (m=0xc1a33100, ifp=<value optimized out>,
rt=0xc1af3d00) at ../../../../netinet/ip_mroute.c:1885
#5 0xc0122582 in ip_input (m=0xc1a33100) at ../../../../netinet/ip_input.c:780
#6 0xc0122884 in ipintr () at ../../../../netinet/ip_input.c:471
#7 0xc010bcb5 in Xsoftnet ()
The kernel has the DIAGNOSTIC option enabled, and the corresponding
code fragment in ip_output() is:
#ifdef DIAGNOSTIC
	if ((m->m_flags & M_PKTHDR) == 0)
		panic("ip_output: no HDR");
	if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv6|M_CSUM_UDPv6)) != 0) {
		panic("ip_output: IPv6 checksum offload flags: %d",
		    m->m_pkthdr.csum_flags);
	}
	if ((m->m_pkthdr.csum_flags & (M_CSUM_TCPv4|M_CSUM_UDPv4)) ==
	    (M_CSUM_TCPv4|M_CSUM_UDPv4)) {
		panic("ip_output: conflicting checksum offload flags: %d",
		    m->m_pkthdr.csum_flags);
	}
#endif
It seems that this diagnostic code checks that M_CSUM_TCPv4 and
M_CSUM_UDPv4 are mutually exclusive.
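To make the condition concrete, here is a minimal standalone sketch of
the same mask test. The two flag values are my assumption of what
sys/mbuf.h defines and should be checked against the actual tree;
csum_flags_conflict() is a hypothetical helper name, not a kernel
function.

```c
#include <stdint.h>

/* Assumed values; verify against sys/mbuf.h in your source tree. */
#define M_CSUM_TCPv4	0x00000001u
#define M_CSUM_UDPv4	0x00000002u

/*
 * Hypothetical helper: returns nonzero only when BOTH the TCPv4 and
 * UDPv4 bits are set, which is the case the DIAGNOSTIC code panics on.
 * Either bit alone, or neither, is accepted.
 */
static int
csum_flags_conflict(uint32_t csum_flags)
{
	const uint32_t both = M_CSUM_TCPv4 | M_CSUM_UDPv4;

	return (csum_flags & both) == both;
}
```

With these assumed values, the csum_flags found in the crash dump
(0x8000004b) has both low bits set, so the test is true for it.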
I got a crash dump and examined it with gdb; the mbuf above contains these values:
$1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xcb336810 "E",
mh_owner = 0x0, mh_len = 0x24, mh_flags = 0x9000203,
mh_paddr = 0x3942a100, mh_type = 0x1}, M_dat = {MH = {MH_pkthdr = {
rcvif = 0xc1b5e03c, tags = {slh_first = 0x0}, len = 0x24,
csum_flags = 0x8000004b, csum_data = 0x4210, segsz = 0x0}, MH_dat = {
...
I don't know exactly where csum_flags was set to 0x8000004b:
M_CSUM_NO_PSEUDOHDR | M_CSUM_IPv4 | M_CSUM_DATA | M_CSUM_UDPv4 | M_CSUM_TCPv4
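The decoding above can be reproduced with a small userland sketch like
the following. The bit values in the table are assumptions based on my
reading of sys/mbuf.h and cover only a subset of the M_CSUM_* flags;
csum_decode() and the table are hypothetical names used for
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct csum_name {
	uint32_t bit;
	const char *name;
};

/* Assumed subset of M_CSUM_* values; verify against sys/mbuf.h. */
static const struct csum_name csum_names[] = {
	{ 0x00000001u, "M_CSUM_TCPv4" },
	{ 0x00000002u, "M_CSUM_UDPv4" },
	{ 0x00000004u, "M_CSUM_TCP_UDP_BAD" },
	{ 0x00000008u, "M_CSUM_DATA" },
	{ 0x00000010u, "M_CSUM_TCPv6" },
	{ 0x00000020u, "M_CSUM_UDPv6" },
	{ 0x00000040u, "M_CSUM_IPv4" },
	{ 0x00000080u, "M_CSUM_IPv4_BAD" },
	{ 0x80000000u, "M_CSUM_NO_PSEUDOHDR" },
};

/*
 * Write the names of the known bits set in 'flags' into 'buf',
 * separated by '|', lowest bit first. Returns the bits that were
 * not written (unknown bits, or names that did not fit).
 */
static uint32_t
csum_decode(uint32_t flags, char *buf, size_t buflen)
{
	size_t i, used = 0;

	if (buflen == 0)
		return flags;
	buf[0] = '\0';
	for (i = 0; i < sizeof(csum_names) / sizeof(csum_names[0]); i++) {
		int n;

		if ((flags & csum_names[i].bit) == 0)
			continue;
		n = snprintf(buf + used, buflen - used, "%s%s",
		    used != 0 ? "|" : "", csum_names[i].name);
		if (n < 0 || (size_t)n >= buflen - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		flags &= ~csum_names[i].bit;
	}
	return flags;
}
```

Calling csum_decode(0x8000004b, buf, sizeof(buf)) with a 128-byte
buffer should leave no bits over and name the same five flags as
listed above.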
This packet was received by bge0, and if_bge.c has this code
fragment in bge_rxeof():
	/*
	 * Rx transport checksum-offload may also
	 * have bugs with packets which, when transmitted,
	 * were `runts' requiring padding.
	 */
	if (cur_rx->bge_flags & BGE_RXBDFLAG_TCP_UDP_CSUM &&
	    (/* (sc->_bge_quirks & BGE_QUIRK_SHORT_CKSUM_BUG) == 0 ||*/
	     m->m_pkthdr.len >= ETHER_MIN_NOPAD)) {
		m->m_pkthdr.csum_data =
		    cur_rx->bge_tcp_udp_csum;
		m->m_pkthdr.csum_flags |=
		    (M_CSUM_TCPv4|M_CSUM_UDPv4|
		     M_CSUM_DATA|M_CSUM_NO_PSEUDOHDR);
	}
But the packet was too short for csum_flags to be set here.
My questions are:
- Is the diagnostic code in ip_output() correct?
- How can I investigate the origin of this problem?
I still have access to the crash dump, but the machine is running at my
customer's site with mrouted stopped.
Thanks in advance for your advice.
--
Takahiro Kambe <***@back-street.net>
--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de