Discussion:
network i/o errors in 3.1 ?
Andreas_Hallmann
2007-01-26 08:06:33 UTC
Hi,
I upgraded my NetBSD network to 3.1.
Since then I have been seeing network input and output errors on my 10MBit/simplex lance ethernet interfaces.
The 100MBit full-duplex hme interfaces are working without input or output errors.
Unfortunately I also added a 3com superstack 3300 switch, replacing its predecessor, which stopped working during the upgrade.

I'm unsure how to interpret this.
What are input/output errors? Yes, my question is that basic.
I have a clear picture of collisions, but I don't know what is counted by the other counters.
It is said that output errors indicate an interface going bad, but it is unlikely that all my Suns' le0 devices are going bad at the same time.

Any Ideas, any pointers?

Thanks AHA
--
NetBSD: If you happen to have any problem with your uptime.


--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Manuel Bouyer
2007-01-27 12:22:39 UTC
Post by Andreas_Hallmann
Hi,
I upgraded my NetBSD network to 3.1.
Since then I have been seeing network input and output errors on my 10MBit/simplex lance ethernet interfaces.
The 100MBit full-duplex hme interfaces are working without input or output errors.
Unfortunately I also added a 3com superstack 3300 switch, replacing its predecessor, which stopped working during the upgrade.
I'm unsure how to interpret this.
What are input/output errors? Yes, my question is that basic.
I have a clear picture of collisions, but I don't know what is counted by the other counters.
It is said that output errors indicate an interface going bad, but it is unlikely that all my Suns' le0 devices are going bad at the same time.
Any ideas, any pointers?
Reading the sources, input errors can be one of:
- received packet larger than the maximum ethernet frame size
- lack of memory for storing the new packet
- various receive errors signaled by the chip (in which case a message is
logged as well)

Output errors can be:
- device timeout (but then you get a message in the logs)
- various transmit errors signaled by the chip, including excessive collisions.
Some of them are logged.
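
To make the two lists above concrete, here is a minimal sketch of how a driver might fold those conditions into the two counters that netstat reports. It is not the NetBSD le driver code: the flag names, the if_counters struct and count_errors() are all made up for illustration, and the bit values do not match the real Lance descriptor layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status flags, NOT the real Lance register layout. */
enum {
    RX_TOO_LONG   = 1u << 0, /* frame larger than the maximum ethernet frame */
    RX_NO_MEMORY  = 1u << 1, /* no memory to store the new packet */
    RX_CHIP_ERROR = 1u << 2, /* receive error signaled by the chip */
    TX_TIMEOUT    = 1u << 3, /* device timeout */
    TX_CHIP_ERROR = 1u << 4  /* transmit error, e.g. excessive collisions */
};

/* Hypothetical stand-in for the per-interface error counters. */
struct if_counters {
    unsigned long ierrors; /* input errors */
    unsigned long oerrors; /* output errors */
};

/* Bump the input and/or output error counter for one status word. */
static void count_errors(struct if_counters *c, uint32_t status)
{
    assert(c != NULL);
    if (status & (RX_TOO_LONG | RX_NO_MEMORY | RX_CHIP_ERROR))
        c->ierrors++;
    if (status & (TX_TIMEOUT | TX_CHIP_ERROR))
        c->oerrors++;
}
```

The point of the sketch is only that several unrelated causes end up in the same counter, so a rising count by itself does not say which cause is responsible; the log messages do.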

You can try
options LEDEBUG
in your kernel config file (you must run make clean before rebuilding)
to get more information.
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
David Laight
2007-02-07 21:14:38 UTC
After an uptime of > 4 days I get, along with many collisions, lots of
le0: missed packet
I dug through the sources but am still unsure.
Does it mean that those 8 receive buffers are insufficient?
Can I increase the number of receive buffers for le0,
or are they part of the lance chip?
The maximum number of rx buffers is 128, and the number must be a power of 2.
I don't know exactly how the NetBSD driver does things though!
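
The power-of-2 rule and the limit of 128 can be sketched as below. This assumes (from memory of the Am7990 data sheet, so treat it as an assumption) that the chip takes the ring length as a 3-bit log2 value, which is where 2^7 = 128 comes from. The names LE_MAX_RING, ring_len_code() and ring_next() are hypothetical and are not taken from the NetBSD le driver.

```c
#include <assert.h>

#define LE_MAX_RING 128u /* 2^7: largest ring a 3-bit log2 field can express */

/* Return log2(n) if n is a power of two in 1..128, otherwise -1. */
static int ring_len_code(unsigned n)
{
    int code = 0;

    if (n == 0 || n > LE_MAX_RING || (n & (n - 1)) != 0)
        return -1;
    while ((n >>= 1) != 0)
        code++;
    return code;
}

/* Advance a ring index; with a power-of-two size a mask replaces modulo. */
static unsigned ring_next(unsigned idx, unsigned n)
{
    assert(ring_len_code(n) >= 0);
    return (idx + 1) & (n - 1);
}
```

With the 8 buffers mentioned in the question, ring_len_code(8) gives 3; a request for 12 or 256 buffers is rejected. Whether the driver can actually be built with a larger ring also depends on how much buffer memory it sets aside, which this sketch does not cover.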

However I suspect you aren't getting interrupts serviced often enough.
Moreover, looking a bit deeper at my dmesg, I wonder why my onboard
le0 is bound to sbus0 and my sbus le1 is bound to ledma0.
Does that mean I have DMA support for le1 but not for le0?
No - the AMD Lance chipset always runs in DMA mode. It really expects
to acquire the host bus (from the cpu) in order to access memory.
For the sbus systems the LSI Logic 'DMA' part acts as the normal bus master
for the on-board bus, and relays transactions from the lance onto the sbus.
In order to do burst transfers on the sbus (and to get enough bandwidth)
the DMA part has (IIRC) two 32 byte buffers into which it pre-fetches TX
data and buffers RX data. The caching and read-ahead algorithm is carefully
designed to work with the transfers the lance actually makes.

The DMA chip has interfaces for the lance, scsi, parallel port and the
ROM (I think that is all). There is one on the motherboard that has all
the onboard devices connected, the sbus ethernet card will have one of its
own with the other devices absent.

There are several different versions of the DMA chip (and I don't know
which Sun motherboards have which); the early one requires that ethernet
rx buffers be on a 32 byte boundary (which sucks because they then need
a software misaligned copy). The DMA2 part handles that ok.
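
A small sketch of the 32 byte constraint and, as far as I can reconstruct it, why it forces the copy: the ethernet header is 14 bytes, so when the buffer itself starts on a 32 byte boundary the payload behind the header starts 2 bytes off a 4 byte boundary. That reading of the "misaligned copy" remark is an assumption, and DMA_ALIGN, dma_align_up() and the other names are hypothetical, not from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_ALIGN     32u /* boundary required by the early DMA chip */
#define ETHER_HDR_LEN 14u /* destination + source + type fields */

/* Non-zero if addr sits on a 32 byte boundary. */
static int is_dma_aligned(uintptr_t addr)
{
    return (addr & (DMA_ALIGN - 1)) == 0;
}

/* Round addr up to the next 32 byte boundary. */
static uintptr_t dma_align_up(uintptr_t addr)
{
    uintptr_t aligned = (addr + (DMA_ALIGN - 1)) & ~(uintptr_t)(DMA_ALIGN - 1);

    assert(is_dma_aligned(aligned));
    return aligned;
}

/* Non-zero if the payload after the ethernet header is 4 byte aligned. */
static int payload_is_word_aligned(uintptr_t buf)
{
    return ((buf + ETHER_HDR_LEN) & 3u) == 0;
}
```

For any buffer address returned by dma_align_up(), payload_is_word_aligned() is false, which is the situation that calls for copying the packet to a suitably offset buffer before the network stack looks at it.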

The other problem is one of sbus latency and priorities. Somewhere
there ought to be a register to set the priorities of the sbus slots'
master accesses. Usually the motherboard port is high priority, and the
others low priority. Under heavy load (and probably requiring 2 dual cpu
modules) the sbus devices can get starved of accesses to main memory.
This causes the lance to time out its bus transfers - which exposes some
silicon bugs in the nmos lance itself :-(
However you aren't seeing those because the system tends to lock solid
on the next access to the lance!
(And I don't remember seeing any of the required code in the netbsd le
or ledma driver.)

David
--
David Laight: ***@l8s.co.uk
