nfe(4) hardware checksum support

Discussion:


Quentin Garnier

2007-02-04 04:31:49 UTC

I applied the patch today and built a new kernel from the -current sources
of today. Unfortunately, the patch doesn't seem to fix the problem.
When accessing two NFS servers simultaneously, packet reception stops
every now and then for about 30 seconds at a time.

RX is definitely the problem. When I experience such stalls (which seem
to happen as soon as a high packet rate comes in), I can still see
packets coming out through nfe. I don't remember if I checked with
tcpdump, but pinging the broadcast address does blink the whole switch,
so I'm quite positive.

--
Quentin Garnier - ***@cubidou.net - ***@NetBSD.org
"You could have made it, spitting out benchmarks
Owe it to yourself not to fail"
Amplifico, Spitting Out Benchmarks, Hometakes Vol. 2, 2005.

Jukka Marin

2007-02-04 08:51:24 UTC


Post by Quentin Garnier
RX is definitely the problem. When I experience such stalls (which seem
to happen as soon as a high packet rate comes in), I can still see
packets coming out through nfe. I don't remember if I checked with
tcpdump, but pinging the broadcast address does blink the whole switch,
so I'm quite positive.

I can confirm this: RX is the problem, and transmitting packets during a
stall does not re-enable reception.

If it's caused by some race condition, how about the attached one?

Should it be applied on top of your previous patch or to clean NetBSD
sources?

BTW, which is your port, i386 or amd64?

i386. I have run amd64 on the system, too, but not long enough to be able
to tell if the nfe driver works better or worse there (had too many problems
with XFree and applications with amd64).

-jm

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Jared D. McNeill

2007-02-04 15:56:57 UTC


I applied the patch today and built a new kernel from the -current sources
of today. Unfortunately, the patch doesn't seem to fix the problem.
When accessing two NFS servers simultaneously, packet reception stops
every now and then for about 30 seconds at a time.

Hmm, I'm afraid it's difficult to debug this by code inspection alone,
but could you check whether TX or RX is causing the problem (with ttcp etc.)
and collect interrupt/network statistics during a stall (with vmstat -i or
netstat -i etc.)?
If FreeBSD or OpenBSD have a similar problem, maybe
we need chip docs (there are several magic values in the source).
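
For anyone collecting those statistics, here is a rough sketch of how periodic
counter samples could be checked for an RX-only stall. It assumes you have
already recorded (time, input packets, output packets) for the nfe interface
by hand or by script, e.g. from repeated netstat -i runs; the function name
find_rx_stalls, the sample layout, and the min_seconds threshold are all
hypothetical and not part of any NetBSD tool.

```python
from typing import List, Tuple

# (time in seconds, input packet count, output packet count)
# Hypothetical layout: fill it from whatever counters you sampled.
Sample = Tuple[float, int, int]


def find_rx_stalls(samples: List[Sample],
                   min_seconds: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) spans where the input counter stayed flat
    while the output counter kept rising, i.e. RX stalled but TX did not."""
    stalls = []
    start = None
    end = 0.0
    for (t0, in0, out0), (t1, in1, out1) in zip(samples, samples[1:]):
        rx_flat = in1 == in0
        tx_moving = out1 > out0
        if rx_flat and tx_moving:
            if start is None:
                start = t0      # first interval of a possible stall
            end = t1
        else:
            # Stall (if any) ended; keep it only if it lasted long enough.
            if start is not None and end - start >= min_seconds:
                stalls.append((start, end))
            start = None
    if start is not None and end - start >= min_seconds:
        stalls.append((start, end))
    return stalls


if __name__ == "__main__":
    # Made-up numbers for illustration only, sampled every 10 seconds.
    demo = [(0, 100, 50), (10, 200, 60), (20, 200, 70),
            (30, 200, 80), (40, 200, 90), (50, 300, 100)]
    print(find_rx_stalls(demo))
```

A span reported here only says that no packets were counted as received while
transmission continued; it does not say why, so the vmstat -i interrupt counts
for the same period would still be needed to tell a lost interrupt from a
wedged receive ring.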

Not sure if it's related, but you might want to have a look at the
DragonFly modifications to the nfe driver to prevent watchdog timeouts:

http://www.dragonflybsd.org/cvsweb/src/sys/dev/netif/nfe/if_nfe.c?rev=1.1&content-type=text/x-cvsweb-markup

Cheers,
Jared
