Discussion:
Strange TCP problem with awge0
Martin Husemann
2015-01-02 16:02:15 UTC
A strange network problem, reproducible on netbsd-current when using an
ARM device with an awge network interface (basically all Allwinner boards),
has been reported to me:

Simply trying to get anything from one of the KDE mirror sites fails:
ftp ftp://ftp.solnet.ch/mirror/KDE/stable/
Connected to ftp.solnet.ch.
220- __ _ _ _ _
..
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.

421 Service not available, remote server timed out. Connection closed.


When running Raspbian on affected hardware and using tnftp, the transfer
"works":

...
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
250 Directory successfully changed.
550 Failed to change directory.
221 Goodbye.


This is very reliably 100% reproducible, for me and others (with a totally
different network connection, so it is neither my ISP nor my router).

I have created some packet captures and stared at them, but I can't see
what is wrong - looks like the ftp server is not seeing/acking our last
packet, but *why* is that so, and why reproducibly?

Also it works fine with other machines running netbsd-current on the same
network connection, so first bet would be a driver bug.

The traces are taken from the outside interface of my NAT/router:

http://www.netbsd.org/~martin/ftp_from_raspbian.pcap
this is the working (Linux) one

http://www.netbsd.org/~martin/ftp_from_netbsd.pcap
http://www.netbsd.org/~martin/ftp_from_netbsd_2.pcap
http://www.netbsd.org/~martin/ftp_from_netbsd_3.pcap
various tries from a netbsd system (did I mention reproducible?)

http://www.netbsd.org/~martin/ftp_from_netbsd_internal.pcap
another try from netbsd, but this time captured on the internal
interface of the NAT.


Any hints?

Martin

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Dennis Ferguson
2015-01-02 18:17:01 UTC
Post by Martin Husemann
This is very reliably 100% reproducable, for me and others (with a totally
different network connection, so it is neither my ISP nor my router).
I have created some packet captures and stared at them, but I can't see
what is wrong - looks like the ftp server is not seeing/acking our last
packet, but *why* is that so, and why reproducably?
It looks to me like the packet that isn't being ack'd may actually be
going in the other direction, from the server to the client. It is
always this one, the same in every trace:

06:30:36.457041 IP (tos 0x0, ttl 50, id 27541, offset 0, flags [DF], proto TCP (6), length 61)
ftp.solnet.ch.ftp > ip-176-199-201-17.hsi06.unitymediagroup.de.65151: Flags [P.], cksum 0xe404 (correct), seq 1296:1305, ack 43, win 1040, options [nop,nop,TS val 3171079293 ecr 1], length 9
0x0000: 4500 003d 6b95 4000 3206 89f3 d465 04f4 E..=***@.2....e..
0x0010: b0c7 c911 0015 fe7f e09e abf3 1996 ee58 ...............X
0x0020: 8018 0410 e404 0000 0101 080a bd02 d47d ...............}
0x0030: 0000 0001 3231 3120 456e 640d 0a ....211.End..
06:30:43.504546 IP (tos 0x0, ttl 50, id 27553, offset 0, flags [DF], proto TCP (6), length 61)
ftp.solnet.ch.ftp > ip-176-199-201-17.hsi06.unitymediagroup.de.65151: Flags [P.], cksum 0xc87c (correct), seq 1296:1305, ack 43, win 1040, options [nop,nop,TS val 3171086341 ecr 1], length 9
0x0000: 4500 003d 6ba1 4000 3206 89e7 d465 04f4 E..=***@.2....e..
0x0010: b0c7 c911 0015 fe7f e09e abf3 1996 ee58 ...............X
0x0020: 8018 0410 c87c 0000 0101 080a bd02 f005 .....|..........
0x0030: 0000 0001 3231 3120 456e 640d 0a ....211.End..
06:30:57.402201 IP (tos 0x0, ttl 50, id 27811, offset 0, flags [DF], proto TCP (6), length 61)
ftp.solnet.ch.ftp > ip-176-199-201-17.hsi06.unitymediagroup.de.65151: Flags [P.], cksum 0x9234 (correct), seq 1296:1305, ack 43, win 1040, options [nop,nop,TS val 3171100237 ecr 1], length 9
0x0000: 4500 003d 6ca3 4000 3206 88e5 d465 04f4 E..=***@.2....e..
0x0010: b0c7 c911 0015 fe7f e09e abf3 1996 ee58 ...............X
0x0020: 8018 0410 9234 0000 0101 080a bd03 264d .....4........&M
0x0030: 0000 0001 3231 3120 456e 640d 0a ....211.End..

In the Linux trace the same packet is sent, with the same sequence offset,
but is immediately ack'd by the client.

When this is going on with NetBSD are any of the netstat -s error counters
incrementing? And is the IP/TCP checksum verification being done in software or
is the device hardware doing it instead? If it is being done in hardware my
first guess, unencumbered by any facts or knowledge, would be that the checksum
hardware could be buggy and something about this packet is exercising the bug.
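
For anyone who wants to check the quoted segment independently of tcpdump, here is a minimal sketch in C (userland, not kernel code). It recomputes the IPv4 header checksum and the TCP checksum, pseudo-header included, over the bytes of the first dump above. The function names are made up for illustration; the byte array is copied from the hex dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First captured copy of the unacknowledged segment (61-byte IP packet),
 * bytes copied from the tcpdump hex dump above. */
const uint8_t pkt[61] = {
    0x45,0x00,0x00,0x3d,0x6b,0x95,0x40,0x00,0x32,0x06,0x89,0xf3,0xd4,0x65,0x04,0xf4,
    0xb0,0xc7,0xc9,0x11,0x00,0x15,0xfe,0x7f,0xe0,0x9e,0xab,0xf3,0x19,0x96,0xee,0x58,
    0x80,0x18,0x04,0x10,0xe4,0x04,0x00,0x00,0x01,0x01,0x08,0x0a,0xbd,0x02,0xd4,0x7d,
    0x00,0x00,0x00,0x01,0x32,0x31,0x31,0x20,0x45,0x6e,0x64,0x0d,0x0a
};

/* Add len bytes to a running sum as big-endian 16-bit words;
 * an odd trailing byte is padded with a zero low byte. */
uint32_t sum_be16(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;
    return sum;
}

/* Fold the carries back in and return the one's complement. */
uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Returns 0 if the IPv4 header checksum verifies. */
uint16_t ip_hdr_verify(const uint8_t *ip)
{
    size_t hlen = (size_t)(ip[0] & 0x0f) * 4;

    return fold(sum_be16(ip, hlen, 0));
}

/* Returns 0 if the TCP checksum verifies (IPv4 pseudo-header included). */
uint16_t tcp_verify(const uint8_t *ip, size_t iplen)
{
    size_t hlen = (size_t)(ip[0] & 0x0f) * 4;
    uint32_t sum;

    assert(iplen >= hlen && ip[9] == 6);    /* protocol must be TCP */
    sum = sum_be16(ip + 12, 8, 0);          /* source and destination address */
    sum += 6;                               /* protocol number */
    sum += (uint32_t)(iplen - hlen);        /* TCP length */
    return fold(sum_be16(ip + hlen, iplen - hlen, sum));
}
```

Both ip_hdr_verify(pkt) and tcp_verify(pkt, sizeof pkt) come out as 0 for these bytes, which agrees with tcpdump's "(correct)".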

Dennis Ferguson
--
Mouse
2015-01-02 18:55:11 UTC
Post by Dennis Ferguson
It looks to me like the packet that isn't being ack'd may actually be
going in the other direction, from the server to the client. It is
always this one, the same in every trace: [...]
I note that the packet in question is relatively small. I once ran
into a problem (on a very different system - mt.Xinu 4.3+NFS on a VAX)
where the hardware simply couldn't receive sufficiently small packets.
In that case, this manifested as an NFS read hanging under obscure
circumstances; perhaps something related is what's going wrong here?

If you have control over the gateway, maybe try hacking on its kernel
to make it pad packets more? A 61-octet IP packet means the Ethernet
packet is 75 octets (plus the FCS), which _should_ be large enough, but
maybe something's broken?
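
The frame-size arithmetic can be written down as a small sketch. The constants are the usual Ethernet II values (14-byte header without a VLAN tag, 64-byte minimum frame including the 4-byte FCS); the names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    ETH_HDR_LEN   = 14,  /* dst MAC + src MAC + EtherType, no VLAN tag */
    ETH_FCS_LEN   = 4,   /* frame check sequence */
    ETH_MIN_FRAME = 64   /* minimum frame length, FCS included */
};

/* Frame length excluding the FCS, before any padding. */
size_t eth_len_no_fcs(size_t ip_len)
{
    assert(ip_len >= 20);   /* at least a minimal IPv4 header */
    return ETH_HDR_LEN + ip_len;
}

/* Pad bytes a sender has to add to reach the minimum frame size. */
size_t eth_pad_needed(size_t ip_len)
{
    size_t min_no_fcs = ETH_MIN_FRAME - ETH_FCS_LEN;   /* 60 */
    size_t len = eth_len_no_fcs(ip_len);

    return len < min_no_fcs ? min_no_fcs - len : 0;
}
```

For the 61-octet packet this gives 75 octets and no padding, so the frame is already above the minimum; a bare 40-octet ACK would need 6 pad bytes.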

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

--
Martin Husemann
2015-01-02 18:58:34 UTC
Post by Dennis Ferguson
It looks to me like the packet that isn't being ack'd may actually be
going in the other direction, from the server to the client. It is
06:30:36.457041 IP (tos 0x0, ttl 50, id 27541, offset 0, flags [DF], proto TCP (6), length 61)
ftp.solnet.ch.ftp > ip-176-199-201-17.hsi06.unitymediagroup.de.65151: Flags [P.], cksum 0xe404 (correct), seq 1296:1305, ack 43, win 1040, options [nop,nop,TS val 3171079293 ecr 1], length 9
0x0010: b0c7 c911 0015 fe7f e09e abf3 1996 ee58 ...............X
0x0020: 8018 0410 e404 0000 0101 080a bd02 d47d ...............}
0x0030: 0000 0001 3231 3120 456e 640d 0a ....211.End..
Ok, I see this packet show up in a tcpdump on the awge interface, so
should be able to trace it further. This is (currently) all software
checksums, and since the packet made it to the bpf_mtap, it can't be
hardware. I don't see an ACK to this packet in the local trace, so I guess
it really is the receiving side somewhere further up in the stack that
drops it.

netstat -s below...

I wonder where the PMTUD blackhole comes from (MTU on awge is 1500,
just like on the other machines here that can do the test without
adding a PMTUD blackhole).

Martin

icmp:
0 calls to icmp_error
0 errors not generated because old message was icmp
Output histogram:
echoreply: 1
0 messages with bad code fields
0 messages < minimum length
0 bad checksums
0 messages with bad length
0 multicast echo requests ignored
0 multicast timestamp requests ignored
Input histogram:
echo: 1
photuris: 1096454688104512
1 message response generated
0 path MTU changes
igmp:
1 message received
0 messages received with too few bytes
0 messages received with bad checksum
1 membership query received
0 membership queries received with invalid field(s)
0 membership reports received
0 membership reports received with invalid field(s)
0 membership reports received for groups to which we belong
0 membership reports sent
tcp:
250 packets sent
170 data packets (33646 bytes)
0 data packets (0 bytes) retransmitted
70 ack-only packets (197 delayed)
0 URG only packets
0 window probe packets
1 window update packet
9 control packets
0 send attempts resulted in self-quench
284 packets received
163 acks (for 33649 bytes)
1 duplicate ack
0 acks for unsent data
250 packets (29579 bytes) received in-sequence
0 completely duplicate packets (0 bytes)
0 old duplicate packets
0 packets with some dup. data (0 bytes duped)
0 out-of-order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
2 window update packets
0 packets received after close
9 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
2 connection requests
2 connection accepts
4 connections established (including accepts)
7 connections closed (including 0 drops)
0 embryonic connections dropped
0 delayed frees of tcpcb
165 segments updated rtt (of 161 attempts)
5 retransmit timeouts
0 connections dropped by rexmit timeout
0 persist timeouts (resulting in 0 dropped connections)
0 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
0 correct ACK header predictions
100 correct data packet header predictions
4 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
1 PMTUD blackhole detected
0 bad connection attempts
2 SYN cache entries added
0 hash collisions
2 completed
0 aborted (no space to build PCB)
0 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
0 dropped due to RST
0 dropped due to ICMP unreachable
2 delayed free of SYN cache entries
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
0 packets with bad signature
0 packets with good signature
0 successful ECN handshakes
0 packets with ECN CE bit
0 packets ECN ECT(0) bit
udp:
16 datagrams received
0 with incomplete header
0 with bad data length field
0 with bad checksum
0 dropped due to no socket
2 broadcast/multicast datagrams dropped due to no socket
0 dropped due to full socket buffers
14 delivered
5 PCB hash misses
14 datagrams output
ip:
303 total packets received
0 bad header checksums
0 with size smaller than minimum
0 with data size < data length
0 with length > max ip packet size
0 with header length < data size
0 with data length < header length
0 with bad options
0 with incorrect version number
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped (out of ipqent)
0 malformed fragments dropped
0 fragments dropped after timeout
0 packets reassembled ok
302 packets for this host
0 packets for unknown/unsupported protocol
0 packets forwarded (0 packets fast forwarded)
1 packet not forwardable
0 redirects sent
0 packets no matching gif found
265 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented
0 fragments created
0 datagrams that can't be fragmented
0 datagrams with bad address in header
ip6:
6 total packets received
0 with size smaller than minimum
0 with data size < data length
0 with bad options
0 with incorrect version number
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 fragments that exceeded limit
0 packets reassembled ok
6 packets for this host
0 packets forwarded
0 packets fast forwarded
0 fast forward flows
0 packets not forwardable
0 redirects sent
15 packets sent from this host
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented
0 fragments created
0 datagrams that can't be fragmented
0 packets that violated scope rules
0 multicast packets which we don't join
Input packet histogram:
UDP: 3
ICMP6: 3
Mbuf statistics:
0 one mbufs
6 one ext mbufs
0 two or more ext mbufs
0 packets whose headers are not continuous
0 tunneling packets that can't find gif
0 packets discarded due to too many headers
0 failures of source address selection
3 forward cache hit
2 forward cache miss
icmp6:
0 calls to icmp6_error
0 errors not generated because old message was icmp6 or so
0 errors not generated because of rate limitation
Output packet histogram:
multicast listener report: 8
router solicitation: 1
neighbor solicitation: 3
0 messages with bad code fields
0 messages < minimum length
0 bad checksums
0 messages with bad length
Input packet histogram:
router advertisement: 1
neighbor advertisement: 2
Histogram of error messages to be generated:
0 no route
0 administratively prohibited
0 beyond scope
0 address unreachable
0 port unreachable
0 packet too big
0 time exceed transit
0 time exceed reassembly
0 erroneous header field
0 unrecognized next header
0 unrecognized option
0 redirect
0 unknown
0 message responses generated
0 messages with too many ND options
0 messages with bad ND options
0 bad neighbor solicitation messages
0 bad neighbor advertisement messages
0 bad router solicitation messages
0 bad router advertisement messages
0 router advertisement routes dropped
0 bad redirect messages
0 path MTU changes
tcp6:
250 packets sent
170 data packets (33646 bytes)
0 data packets (0 bytes) retransmitted
70 ack-only packets (197 delayed)
0 URG only packets
0 window probe packets
1 window update packet
9 control packets
0 send attempts resulted in self-quench
284 packets received
163 acks (for 33649 bytes)
1 duplicate ack
0 acks for unsent data
250 packets (29579 bytes) received in-sequence
0 completely duplicate packets (0 bytes)
0 old duplicate packets
0 packets with some dup. data (0 bytes duped)
0 out-of-order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
2 window update packets
0 packets received after close
9 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
2 connection requests
2 connection accepts
4 connections established (including accepts)
7 connections closed (including 0 drops)
0 embryonic connections dropped
0 delayed frees of tcpcb
165 segments updated rtt (of 161 attempts)
5 retransmit timeouts
0 connections dropped by rexmit timeout
0 persist timeouts (resulting in 0 dropped connections)
0 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
0 correct ACK header predictions
100 correct data packet header predictions
4 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
1 PMTUD blackhole detected
0 bad connection attempts
2 SYN cache entries added
0 hash collisions
2 completed
0 aborted (no space to build PCB)
0 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
0 dropped due to RST
0 dropped due to ICMP unreachable
2 delayed free of SYN cache entries
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
0 packets with bad signature
0 packets with good signature
0 successful ECN handshakes
0 packets with ECN CE bit
0 packets ECN ECT(0) bit
udp6:
3 datagrams received
0 with incomplete header
0 with bad data length field
0 with bad checksum
0 with no checksum
0 dropped due to no socket
0 multicast datagrams dropped due to no socket
0 dropped due to full socket buffers
3 delivered
3 datagrams output
pim6:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 messages received with bad version
0 registers received
0 bad registers received
0 registers sent
rip6:
0 messages received
0 checksum calculations on inbound
0 messages with bad checksum
0 messages dropped due to no socket
0 multicast messages dropped due to no socket
0 messages dropped due to full socket buffers
0 delivered
0 datagrams output
arp:
4 packets sent
0 reply packets
4 request packets
13 packets received
3 reply packets
10 valid request packets
10 broadcast/multicast packets
0 packets with unknown protocol type
0 packets with bad (short) length
0 packets with null target IP address
0 packets with null source IP address
0 could not be mapped to an interface
0 packets sourced from a local hardware address
0 packets with a broadcast source hardware address
0 duplicates for a local IP address
0 attempts to overwrite a static entry
0 packets received on wrong interface
0 entrys overwritten
0 changes in hardware address length
3 packets deferred pending ARP resolution
3 sent
0 dropped
0 failures to allocate llinfo

--
Dennis Ferguson
2015-01-02 19:03:20 UTC
<...>
Post by Martin Husemann
284 packets received
<...>
Post by Martin Husemann
9 discarded for bad checksums
This one is suspicious. Does this increment when the ftp connection hangs?

Dennis Ferguson

--
Martin Husemann
2015-01-02 19:23:33 UTC
Post by Dennis Ferguson
Post by Martin Husemann
9 discarded for bad checksums
This one is suspicious. Does this increment when the ftp connection hangs?
Yes, it does. Wouldn't those dropped packets be in the local capture though?
When I ask wireshark to validate the tcp checksums, there are no failures
in the local capture.

Martin

--
Dennis Ferguson
2015-01-02 19:45:16 UTC
Post by Martin Husemann
Post by Dennis Ferguson
Post by Martin Husemann
9 discarded for bad checksums
This one is suspicious. Does this increment when the ftp connection hangs?
Yes, it does. Wouldn't those dropped packets be in the local capture though?
When I ask wireshark to validate the tcp checksums, there are no failures
in the local capture.
tcpdump thinks those packets have correct checksums too, so the
next guess might be that the software checksum function being used
in the packet processing path to check the checksum has a bug. It
might also be related to how the packet ends up being stored in its
mbuf.

A decade ago I would never have guessed that software checksum bugs
could happen without being quickly noticed, but now the prevalence
of hardware checksum checks means the software doesn't get exercised
as much as it once did so I'm now always a bit suspicious of it (as
well as the hardware checksums, since there's a big variety of hardware).

Dennis Ferguson
--
Martin Husemann
2015-01-02 20:00:14 UTC
Post by Dennis Ferguson
tcpdump thinks those packets have correct checksums too, so the
next guess might be that the software checksum function being used
in the packet processing path to check the checksum has a bug. It
might also be related to how the packet ends up being stored in its
mbuf.
I disabled all hardware checksums on another arm SoC and can't reproduce
that problem there. I also do not have the special cortex/NEON checksum
enabled (no option NEON_IN_CKSUM).

Martin

--
Dennis Ferguson
2015-01-02 20:17:44 UTC
Post by Martin Husemann
Post by Dennis Ferguson
tcpdump thinks those packets have correct checksums too, so the
next guess might be that the software checksum function being used
in the packet processing path to check the checksum has a bug. It
might also be related to how the packet ends up being stored in its
mbuf.
I disabled all hardware checksums on another arm SoC and can't reproduce
that problem there. I also do not have the special cortex/NEON checksum
enabled (no option NEON_IN_CKSUM).
Is there something else strange about the CPU that is failing? Is
it big-endian byte order?

Dennis Ferguson

--
Martin Husemann
2015-01-02 20:20:12 UTC
Post by Dennis Ferguson
Is there something else strange about the CPU that is failing? Is
it big-endian byte order?
It happens with both big and little endian.

cpu0 at mainbus0 core 0: 960 MHz Cortex-A7 r0p4 (Cortex V7A core)
cpu0: DC enabled IC enabled WB disabled EABT branch prediction enabled
cpu0: isar: [0]=0x2101110 [1]=0x13112111 [2]=0x21232041 [3]=0x11112131, [4]=0x10011142, [5]=0
cpu0: mmfr: [0]=0x10101105 [1]=0x40000000 [2]=0x1240000 [3]=0x2102211
cpu0: pfr: [0]=0x1131 [1]=0x11011
cpu0: 32KB/32B 2-way L1 VIPT Instruction cache
cpu0: 32KB/64B 4-way write-back-locking-C L1 PIPT Data cache
cpu0: 256KB/64B 8-way write-through L2 PIPT Unified cache
vfp0 at cpu0: NEON MPE (VFP 3.0+), rounding, NaN propagation, denormals
vfp0: mvfr: [0]=0x10110222 [1]=0x11111111


Martin

--
Robert Swindells
2015-01-02 20:25:38 UTC
Post by Martin Husemann
Post by Dennis Ferguson
tcpdump thinks those packets have correct checksums too, so the
next guess might be that the software checksum function being used
in the packet processing path to check the checksum has a bug. It
might also be related to how the packet ends up being stored in its
mbuf.
I disabled all hardware checksums on another arm SoC and can't reproduce
that problem there. I also do not have the special cortex/NEON checksum
enabled (no option NEON_IN_CKSUM).
There have been problems seen recently using software checksums on
amd64 and sparc64.

These cases were hard to reproduce and caused panics, maybe you have
found a better way to track down the same bug.

Robert Swindells


--
Dennis Ferguson
2015-01-02 21:04:52 UTC
Post by Martin Husemann
Post by Dennis Ferguson
Is there something else strange about the CPU that is failing? Is
it big-endian byte order?
It happens with both big and little endian.
Ah, I guess I should have asked, is the kernel which exhibits a
problem running big-endian or little-endian? The checksum function
which may be failing has both endian dependencies and a bunch of corner
cases related to the alignment of the packet data in memory. I was
wondering which branch of the endian #ifdefs needs to be stared at.
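
As background on why such code has endian #ifdefs at all, here is a rough sketch (not the NetBSD routine; all names are hypothetical). Summing the 16-bit words as a big-endian or as a little-endian CPU would load them gives results that are byte swaps of each other, so the bulk loop is byte-order neutral and only the handling of odd leading or trailing bytes depends on the byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 header of the segment quoted earlier in the thread. */
const uint8_t hdr[20] = {
    0x45,0x00,0x00,0x3d,0x6b,0x95,0x40,0x00,0x32,0x06,
    0x89,0xf3,0xd4,0x65,0x04,0xf4,0xb0,0xc7,0xc9,0x11
};

/* Fold carries back into the low 16 bits (end-around carry). */
uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Sum as a big-endian CPU sees 16-bit loads: first byte is the high byte. */
uint16_t sum_as_be(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(len <= 65535);               /* keeps the 32-bit sum from wrapping */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;     /* odd tail, zero pad in the low byte */
    return fold16(sum);
}

/* Sum as a little-endian CPU sees 16-bit loads: first byte is the low byte. */
uint16_t sum_as_le(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(len <= 65535);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i + 1] << 8) | p[i];
    if (i < len)
        sum += p[i];                    /* same zero pad, byte-swapped view */
    return fold16(sum);
}

uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

For any buffer, sum_as_le(p, n) equals swap16(sum_as_be(p, n)), including odd n, which is where the endian-specific branches come in.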

Dennis Ferguson

--
Joerg Sonnenberger
2015-01-05 18:06:00 UTC
Post by Martin Husemann
A strange network problem, reproducable on netbsd-current when using an
ARM device with awge network interface (basically all Allwinner boards)
I had a similar problem when connecting to my laptop. Disabling
CPU_IN_CKSUM fixes that, so it certainly sounds like a problem with the
ARM-specific checksumming routines.

Joerg

--
Martin Husemann
2015-01-05 21:48:55 UTC
Post by Martin Husemann
Simply trying to get anything from one of the KDE mirror sites fails
ftp ftp://ftp.solnet.ch/mirror/KDE/stable/
Connected to ftp.solnet.ch.
220- __ _ _ _ _
..
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
421 Service not available, remote server timed out. Connection closed.
I can confirm what joerg said: a kernel with "no options CPU_IN_CKSUM"
works just fine:

220- ... powered by FreeBSD!
220
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
200 Switching to Binary mode.
250 Directory successfully changed.
550 Failed to change directory.
221 Goodbye.


Martin

--
Dennis Ferguson
2015-01-05 22:47:29 UTC
Post by Joerg Sonnenberger
Post by Martin Husemann
A strange network problem, reproducable on netbsd-current when using an
ARM device with awge network interface (basically all Allwinner boards)
I had a similar problem when connection to my laptop. Disabling
CPU_IN_CKSUM fixes that, so it certainly sounds like a problem with the
ARM specific checksumming rountines.
It is also possible to surmise the corner case in the checksum function
that a 61 byte IP packet might be exercising.

- The ethernet driver appears to receive an ethernet frame into
an mcluster aligned with the start of the mcluster. This leaves
the IP headers in memory with 2 byte (but not 4 byte) address
alignment.

- In ip_input() the alignment is corrected by copying the front
of the packet up into a new mbuf with m_copyup(). If IPv6 is
configured m_copyup() will copy the first 60 bytes (max_protohdr)
of the packet into the new mbuf.

- This leaves the 61-byte IP packet spread across 2 mbufs, the first
with 60 bytes aligned to a 4-byte address and the second with a
single byte located in memory with 2-byte alignment. The latter
is a significant corner case for a function trying to do the
computation in word-sized, word-aligned chunks.

That said, I can't see a problem with the arm checksum code by eye and
I'm not in a position to test for this at the moment.
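
To make the corner case concrete, here is a minimal userland sketch, assuming a simplified buffer chain (struct seg is a hypothetical stand-in, not the kernel's struct mbuf, and this is not the ARM assembly routine). It computes the checksum over a chain while tracking the byte offset within the whole packet, and compares the result with a flat reference for every possible split point. The data is the 41-byte TCP segment from the dump earlier in the thread; a cut at 40 bytes corresponds to the 60 + 1 layout described above once the 20-byte IP header is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf chain. */
struct seg {
    const uint8_t *data;
    size_t len;
    const struct seg *next;
};

/* TCP header and payload (41 bytes) of the segment quoted in the thread. */
const uint8_t tcp_seg[41] = {
    0x00,0x15,0xfe,0x7f,0xe0,0x9e,0xab,0xf3,0x19,0x96,0xee,0x58,
    0x80,0x18,0x04,0x10,0xe4,0x04,0x00,0x00,0x01,0x01,0x08,0x0a,
    0xbd,0x02,0xd4,0x7d,0x00,0x00,0x00,0x01,0x32,0x31,0x31,0x20,
    0x45,0x6e,0x64,0x0d,0x0a
};

uint16_t fold_not(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Reference: checksum of a flat buffer, big-endian word order. */
uint16_t cksum_flat(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    assert(len <= 65535);               /* keeps the 32-bit sum from wrapping */
    for (size_t i = 0; i < len; i++)
        sum += (i & 1) ? p[i] : (uint32_t)p[i] << 8;
    return fold_not(sum);
}

/* Chain version: whether a byte is the high or low half of a word depends
 * on its offset in the whole packet, not on its offset in its own segment. */
uint16_t cksum_chain(const struct seg *s)
{
    uint32_t sum = 0;
    size_t off = 0;

    for (; s != NULL; s = s->next)
        for (size_t i = 0; i < s->len; i++, off++)
            sum += (off & 1) ? s->data[i] : (uint32_t)s->data[i] << 8;
    return fold_not(sum);
}

/* Returns 1 if every two-way split of p gives the same result as the flat sum. */
int splits_agree(const uint8_t *p, size_t len)
{
    uint16_t want = cksum_flat(p, len);

    for (size_t cut = 0; cut <= len; cut++) {
        struct seg tail = { p + cut, len - cut, NULL };
        struct seg head = { p, cut, &tail };

        if (cksum_chain(&head) != want)
            return 0;
    }
    return 1;
}
```

An optimized routine that works in word-sized, word-aligned chunks has to reproduce this offset tracking itself, typically by byte-swapping the partial sum when a segment starts at an odd packet offset. A comparison like splits_agree() against the optimized routine is one way to look for a split that goes wrong.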

Dennis Ferguson
--
Christos Zoulas
2015-01-05 23:36:09 UTC
Post by Martin Husemann
Simply trying to get anything from one of the KDE mirror sites fails
ftp ftp://ftp.solnet.ch/mirror/KDE/stable/
Connected to ftp.solnet.ch.
220- __ _ _ _ _
..
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
421 Service not available, remote server timed out. Connection closed.
I've moved joerg's cpu_in_cksum regress in_cksum test to atf. It seems to be
broken on arm... Give it a try.

christos


--
Masanobu SAITOH
2015-01-06 00:07:26 UTC
Post by Dennis Ferguson
Post by Joerg Sonnenberger
Post by Martin Husemann
A strange network problem, reproducable on netbsd-current when using an
ARM device with awge network interface (basically all Allwinner boards)
I had a similar problem when connection to my laptop. Disabling
CPU_IN_CKSUM fixes that, so it certainly sounds like a problem with the
ARM specific checksumming rountines.
It is also possible to surmise the corner case in the checksum function
that a 61 byte IP packet might be exercising.
- The ethernet driver appears to receive an ethernet frame into
an mcluster aligned with the start of the mcluster. This leaves
the IP headers in memory with 2 byte (but not 4 byte) address
alignment.
- In ip_input() the alignment is corrected by copying the front
of the packet up into a new mbuf with m_copyup(). If IPv6 is
configured m_copyup() will copy the first 60 bytes (max_protohdr)
of the packet into the new mbuf.
- This leaves the 61 IP byte packet spread across 2 mbufs, the first
with 60 bytes aligned to a 4-byte address and the second with a
single byte located in memory with 2-byte alignment. The latter
is a significant corner case for a function trying to do the
computation in word-sized, word-aligned chunks.
That said, I can't see a problem with the arm checksum code by eye and
I'm not in a position to test for this at the moment.
Dennis Ferguson
When I saw the first mail, I remembered the following PR:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=46898

Perhaps the problem is not related to this PR's bug though...
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)


--
Nick Hudson
2015-02-17 10:09:38 UTC
Post by Christos Zoulas
Post by Martin Husemann
Simply trying to get anything from one of the KDE mirror sites fails
ftp ftp://ftp.solnet.ch/mirror/KDE/stable/
Connected to ftp.solnet.ch.
220- __ _ _ _ _
..
331 Please specify the password.
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
421 Service not available, remote server timed out. Connection closed.
I've moved joerg's cpu_in_cksum regress in_cksum test to atf. It seems to be
broken on arm... Give it a try.
christos
For the record... this is now fixed with
sys/arch/arm/arm/cpu_in_cksum.S:1.11

Nick
