Discussion:
wm(4)
Robert Swindells
2015-10-28 16:14:03 UTC
Anyone else having problems with wm(4) in current ?

Works fine in a kernel from Oct 5, doesn't do anything in latest version.

The dmesg lines are:

wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
wm0: PCI-Express bus
wm0: 2048 words FLASH, version 1.8.0, Image Unique ID 0000ffff
wm0: Ethernet address 68:05:ca:28:b1:c7
makphy0 at wm0 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1

Robert Swindells

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Robert Swindells
2015-10-28 20:05:38 UTC
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
Some more info, it is sending packets but not responding to incoming
ones.

SAITOH Masanobu
2015-10-28 20:30:11 UTC
Post by Robert Swindells
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
Some more info, it is sending packets but not responding to incoming
ones.
Could you show me the output of "ifconfig -v wm0"?
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)

Mike Pumford
2015-10-28 21:00:28 UTC
Post by Robert Swindells
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
Some more info, it is sending packets but not responding to incoming
ones.
I have some intermittent issues on NetBSD 7.0-RELEASE but only on one
machine and pretty much all my x86 systems have wm adapters of one sort
or another. Occasionally it will fail to get an address via DHCP. Next
time it goes odd I'll grab the ifconfig -v output and see if I can grab
some packet dumps from the DHCP server side. For reference the systems
which have no problems at all ever:

NetBSD 7.0/amd64:
wm0 at pci4 dev 3 function 0: Intel i82541PI 1000BASE-T Ethernet (rev. 0x05)
wm0: interrupting at ioapic0 pin 22
wm0: 32-bit 33MHz PCI bus
wm0: 64 words (6 address bits) MicroWire EEPROM
wm0: Ethernet address 00:0e:0c:72:67:5a
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0

NetBSD 6.1-STABLE/i386:
wm0 at pci5 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 19
wm0: PCI-Express bus
wm0: 2048 words (8 address bits) SPI EEPROM
wm0: Ethernet address 00:00:24:d0:a1:fc
ukphy0 at wm0 phy 1: OUI 0x000ac2, model 0x000b, rev. 1

And the intermittently problematic system:
NetBSD 7.0/amd64 (exactly the same build and kernel as the first system)

wm0 at pci0 dev 25 function 0: I218 V Ethernet Connection (rev. 0x00)
wm0: interrupting at ioapic0 pin 20
wm0: PCI-Express bus
wm0: 2048 words FLASH
wm0: Ethernet address 10:c3:7b:95:20:ed
ihphy0 at wm0 phy 2: i217 10/100/1000 media interface, rev. 5


This is the ifconfig -v from the intermittent system:
wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=0
address: 10:c3:7b:95:20:ed
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
status: active
input: 99078965 packets, 12756425569 bytes, 1639632 multicasts
output: 19942431 packets, 225995936552 bytes, 194 multicasts, 1 error
inet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::12c3:7bff:fe95:20ed%wm0 prefixlen 64 temporary scopeid 0x1
inet6 2001:8b0:84:1:12c3:7bff:fe95:20ed prefixlen 64
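
When comparing the capabilities and enabled words between machines, the value 0x7ff80 can be decoded mechanically. What follows is a minimal sketch, not ifconfig's own code. It assumes that the twelve flag names printed above correspond to consecutive bits starting at bit 7 (0x80); that matches the value shown but is not stated anywhere in this thread. decode_caps and print_caps are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Flag names in the order ifconfig printed them above.
 * ASSUMPTION: they map to consecutive bits starting at bit 7 (0x80).
 */
static const char *const cap_names[] = {
    "TSO4", "IP4CSUM_Rx", "IP4CSUM_Tx", "TCP4CSUM_Rx",
    "TCP4CSUM_Tx", "UDP4CSUM_Rx", "UDP4CSUM_Tx", "TCP6CSUM_Rx",
    "TCP6CSUM_Tx", "UDP6CSUM_Rx", "UDP6CSUM_Tx", "TSO6",
};
#define CAP_FIRST_BIT 7
#define CAP_COUNT (sizeof(cap_names) / sizeof(cap_names[0]))

/* Store the names of the set flags in out[] (at most max); return the count. */
size_t decode_caps(uint32_t caps, const char *out[], size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < CAP_COUNT && n < max; i++) {
        if (caps & (UINT32_C(1) << (CAP_FIRST_BIT + i)))
            out[n++] = cap_names[i];
    }
    return n;
}

/* Print the mask followed by the decoded flag names on one line. */
void print_caps(uint32_t caps)
{
    const char *names[CAP_COUNT];
    size_t n = decode_caps(caps, names, CAP_COUNT);

    printf("%#x:", (unsigned)caps);
    for (size_t i = 0; i < n; i++)
        printf(" %s", names[i]);
    printf("\n");
}
```

Calling print_caps(0x7ff80) lists all twelve names. A host with some offloads disabled would show a shorter list for its enabled word, which makes two machines easy to compare.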

I'm guessing the one error in those stat counters corresponds to this in
my dmesg:
wm0: device timeout (txfree 4068 txsfree 55 txnext 690)

As I say next time I see a failure to acquire an address I'll acquire
the ifconfig -v output again. On this particular boot the system got its
DHCP allocation fine and has been running reliably ever since apart from
that one glitch.

Mike

SAITOH Masanobu
2015-10-28 21:35:35 UTC
Post by Mike Pumford
Post by Robert Swindells
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
Some more info, it is sending packets but not responding to incoming
ones.
wm0 at pci4 dev 3 function 0: Intel i82541PI 1000BASE-T Ethernet (rev. 0x05)
wm0: interrupting at ioapic0 pin 22
wm0: 32-bit 33MHz PCI bus
wm0: 64 words (6 address bits) MicroWire EEPROM
wm0: Ethernet address 00:0e:0c:72:67:5a
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
wm0 at pci5 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 19
wm0: PCI-Express bus
wm0: 2048 words (8 address bits) SPI EEPROM
wm0: Ethernet address 00:00:24:d0:a1:fc
ukphy0 at wm0 phy 1: OUI 0x000ac2, model 0x000b, rev. 1
NetBSD 7.0/amd64 (exactly the same build and kernel as the first system)
wm0 at pci0 dev 25 function 0: I218 V Ethernet Connection (rev. 0x00)
wm0: interrupting at ioapic0 pin 20
wm0: PCI-Express bus
wm0: 2048 words FLASH
wm0: Ethernet address 10:c3:7b:95:20:ed
ihphy0 at wm0 phy 2: i217 10/100/1000 media interface, rev. 5
wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=0
address: 10:c3:7b:95:20:ed
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
status: active
input: 99078965 packets, 12756425569 bytes, 1639632 multicasts
output: 19942431 packets, 225995936552 bytes, 194 multicasts, 1 error
inet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255
inet6 fe80::12c3:7bff:fe95:20ed%wm0 prefixlen 64 temporary scopeid 0x1
inet6 2001:8b0:84:1:12c3:7bff:fe95:20ed prefixlen 64
wm0: device timeout (txfree 4068 txsfree 55 txnext 690)
This device timeout might be related to PR#40981.

http://gnats.netbsd.org/40981
Post by Mike Pumford
As I say next time I see a failure to acquire an address I'll acquire the ifconfig -v output again. On this particular boot the system got its DHCP allocation fine and has been running reliably ever since apart from that one glitch.
Mike
By the way, wm(4) has a small problem. When we run "ifconfig wm0 up" on
some chips with copper media, the following sequence is observed:

0) link goes up

1) after 1 second, it goes down

2) and a few seconds later, link goes up again

I fixed this problem yesterday. The change also fixes a problem where
some ICH8+82566 systems link up at 100M in step 2) above when connected
to a 1G switch and configured with "ifconfig wm0 auto". I'll commit the
change in a few days.
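
The up, down, up sequence described above can be detected in a log of link-state changes. This is only a sketch under stated assumptions: struct link_event and link_bounced are hypothetical names, and the timestamps are assumed to have been collected by some other means (for example, by polling the interface status once per second). It is not part of wm(4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One observed link-state change (hypothetical record format). */
struct link_event {
    double t;   /* seconds since the interface was brought up */
    bool up;    /* true: link up, false: link down */
};

/*
 * Return true if the events contain three consecutive entries
 * up -> down -> up whose total duration is at most `window` seconds.
 */
bool link_bounced(const struct link_event *ev, size_t n, double window)
{
    assert(ev != NULL || n == 0);
    for (size_t i = 0; i + 2 < n; i++) {
        if (ev[i].up && !ev[i + 1].up && ev[i + 2].up &&
            ev[i + 2].t - ev[i].t <= window)
            return true;
    }
    return false;
}
```

With events at 0 s (up), 1 s (down) and 4 s (up), link_bounced reports a bounce for a 10-second window but not for a 2-second one.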
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)

Mike Pumford
2015-10-31 15:31:07 UTC
Post by SAITOH Masanobu
Post by Mike Pumford
wm0: device timeout (txfree 4068 txsfree 55 txnext 690)
This device timeout might be related to PR#40981.
http://gnats.netbsd.org/40981
Looks plausible. So far it's a one-off event, but it is one I've seen
before on another system. Anything I can do to help debug?
Post by SAITOH Masanobu
By the way, wm(4) has a small problem. When we run "ifconfig wm0 up" on
0) link goes up
1) after 1 second, it goes down
2) and a few seconds later, link goes up again
I fixed this problem yesterday. The change also fixes a problem where
some ICH8+82566 systems link up at 100M in step 2) above when connected
to a 1G switch and configured with "ifconfig wm0 auto". I'll commit the
change in a few days.
Ah. That little glitch is exactly what causes DHCP to go wrong on my
system, and now that I think about it some more, in that scenario I do see
frames being transmitted but none being received. Is this fix likely to
be pulled up into 7.x?

Mike


Masanobu SAITOH
2015-11-02 10:20:33 UTC
Post by SAITOH Masanobu
Post by Mike Pumford
wm0: device timeout (txfree 4068 txsfree 55 txnext 690)
This device timeout might be related to PR#40981.
http://gnats.netbsd.org/40981
Looks plausible. So far it's a one-off event, but it is one I've seen before on another system. Anything I can do to help debug?
Thank you, but there is no need: I can reproduce this problem
myself.
Post by SAITOH Masanobu
By the way, wm(4) has a small problem. When we run "ifconfig wm0 up" on
0) link goes up
1) after 1 second, it goes down
2) and a few seconds later, link goes up again
I fixed this problem yesterday. The change also fixes a problem where
some ICH8+82566 systems link up at 100M in step 2) above when connected
to a 1G switch and configured with "ifconfig wm0 auto". I'll commit the
change in a few days.
Ah. That little glitch is exactly what causes DHCP to go wrong on my system, and now that I think about it some more, in that scenario I do see frames being transmitted but none being received. Is this fix likely to be pulled up into 7.x?
Mike
Those changes will be pulled up to netbsd-7 after waiting
a few weeks, in case I have introduced a new bug.

Thanks.
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)

SAITOH Masanobu
2015-10-28 22:08:15 UTC
Post by Robert Swindells
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
Some more info, it is sending packets but not responding to incoming
ones.
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
Not MSI-X but INTx? Are you using Xen dom0 or non x86 machine?
Post by Robert Swindells
wm0: PCI-Express bus
wm0: 2048 words FLASH, version 1.8.0, Image Unique ID 0000ffff
wm0: Ethernet address 68:05:ca:28:b1:c7
makphy0 at wm0 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
Robert Swindells
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)

Robert Swindells
2015-10-28 23:42:02 UTC
Post by SAITOH Masanobu
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
Not MSI-X but INTx? Are you using Xen dom0 or non x86 machine?
It is an AMD amd64 machine, not running Xen.

I don't know whether it should support MSI-X or not, I am running a
custom kernel on it which did not contain wmimsi and acpiwmi but adding
them makes no difference. Is there anything else that should be added ?

Output of ifconfig -v is:

wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=0
address: 68:05:ca:28:b1:c7
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
status: active
input: 5 packets, 1480 bytes
output: 27 packets, 4156 bytes, 11 multicasts
inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
inet6 fe80::6a05:caff:fe28:b1c7%wm0 prefixlen 64 scopeid 0x2

Also tried with all offloading options turned off, no change.

Running tcpdump on another machine shows it sending ARP requests.

Version 1.350 of if_wm.c works.

Robert Swindells

Masanobu SAITOH
2015-10-29 07:25:38 UTC
Post by Robert Swindells
Post by SAITOH Masanobu
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
Not MSI-X but INTx? Are you using Xen dom0 or non x86 machine?
It is an AMD amd64 machine, not running Xen.
There are two other cases in which it falls back to INTx:

a) Old or broken PCI hostbridge. There is a blacklist in x86/pci/pci_machdep.c
(pci_msi_quirk_tbl).

b) A lot of interrupts are assigned to cpu0. On x86, interrupt sources are
managed by cpu_info->ci_isources[], which has a fixed size. Legacy interrupts
can share one entry, but MSI/MSI-X can't.

I suspect your machine is in b).
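
Case b) can be illustrated with a toy model of a fixed-size per-CPU interrupt source table. This is a sketch only and is not NetBSD's implementation: MAX_SOURCES, struct isource, struct cpu_slots and establish are hypothetical names, and the table size of 4 is chosen purely to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SOURCES 4   /* hypothetical size; the real table is larger */

enum intr_type { INTR_LEGACY, INTR_MSI };

struct isource {
    bool used;
    enum intr_type type;
    int pin;        /* legacy: interrupt pin; MSI/MSI-X: -1 */
    int handlers;   /* number of handlers attached to this slot */
};

struct cpu_slots {
    struct isource src[MAX_SOURCES];
};

/*
 * Attach a handler and return its slot index, or -1 if the table is full.
 * Legacy interrupts on the same pin reuse an existing slot; every
 * MSI/MSI-X vector needs a free slot of its own.
 */
int establish(struct cpu_slots *cs, enum intr_type type, int pin)
{
    assert(cs != NULL);

    if (type == INTR_LEGACY) {
        for (int i = 0; i < MAX_SOURCES; i++) {
            struct isource *s = &cs->src[i];
            if (s->used && s->type == INTR_LEGACY && s->pin == pin) {
                s->handlers++;      /* shared entry */
                return i;
            }
        }
    }
    for (int i = 0; i < MAX_SOURCES; i++) {
        struct isource *s = &cs->src[i];
        if (!s->used) {
            s->used = true;
            s->type = type;
            s->pin = (type == INTR_LEGACY) ? pin : -1;
            s->handlers = 1;
            return i;
        }
    }
    return -1;  /* no free slot: the caller would have to fall back */
}
```

In this model, once the table is full a further MSI request fails, while another legacy handler on an already registered pin still succeeds by sharing its slot.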

I've now committed a change that should fix your problem. Could you test
the latest if_wm.c?
Post by Robert Swindells
Module Name: src
Committed By: msaitoh
Date: Thu Oct 29 07:24:01 UTC 2015
src/sys/dev/pci: if_wm.c
Fix a bug where the multiqueue setting was applied on a multiqueue-capable
chip even though the machine can't use MSI-X. In that case, only one queue must
This change should fix a problem which was reported by Robert Swindells.
cvs rdiff -u -r1.374 -r1.375 src/sys/dev/pci/if_wm.c
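
The rule in the commit message can be sketched as a queue-count decision. This is not the code from if_wm.c: struct nic, enum intr_mode and nic_queue_count are hypothetical names, and reading the truncated sentence above as "only one queue is used" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Interrupt mode the driver actually obtained at attach time. */
enum intr_mode { MODE_INTX, MODE_MSI, MODE_MSIX };

struct nic {                    /* hypothetical, not struct wm_softc */
    bool chip_multiqueue;       /* hardware has more than one queue */
    int chip_max_queues;        /* number of queues the chip offers */
    enum intr_mode mode;
};

/*
 * Decide how many queues to configure. Multiple queues are used only
 * when the chip supports them AND MSI-X was obtained; otherwise the
 * device is set up with a single queue.
 */
int nic_queue_count(const struct nic *n)
{
    assert(n != NULL);

    if (!n->chip_multiqueue || n->mode != MODE_MSIX)
        return 1;
    return n->chip_max_queues > 0 ? n->chip_max_queues : 1;
}
```

The point of the design is that the decision depends on the interrupt mode that was actually allocated, not only on what the chip is capable of.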
Regards.
Post by Robert Swindells
I don't know whether it should support MSI-X or not, I am running a
custom kernel on it which did not contain wmimsi and acpiwmi but adding
them makes no difference. Is there anything else that should be added ?
wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=0
address: 68:05:ca:28:b1:c7
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
status: active
input: 5 packets, 1480 bytes
output: 27 packets, 4156 bytes, 11 multicasts
inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
inet6 fe80::6a05:caff:fe28:b1c7%wm0 prefixlen 64 scopeid 0x2
Also tried with all offloading options turned off, no change.
Running tcpdump on another machine shows it sending ARP requests.
Version 1.350 of if_wm.c works.
Robert Swindells
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)

Robert Swindells
2015-10-29 12:56:44 UTC
Post by Masanobu SAITOH
Post by Robert Swindells
Post by SAITOH Masanobu
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
Not MSI-X but INTx? Are you using Xen dom0 or non x86 machine?
It is an AMD amd64 machine, not running Xen.
a) Old or broken PCI hostbridge. There is a blacklist in x86/pci/pci_machdep.c
(pci_msi_quirk_tbl).
b) A lot of interrupts are assigned to cpu0. On x86, interrupt sources are
managed by cpu_info->ci_isources[], which has a fixed size. Legacy interrupts
can share one entry, but MSI/MSI-X can't.
I suspect your machine is in b).
I've now committed a change that should fix your problem. Could you test
the latest if_wm.c?
Works fine now, thanks for your help.

The hostbridge isn't listed in the quirks table, it is a RS780.

Robert Swindells


Masanobu SAITOH
2015-10-30 07:10:41 UTC
Hi.
Post by Robert Swindells
Post by Masanobu SAITOH
Post by Robert Swindells
Post by SAITOH Masanobu
Post by Robert Swindells
Anyone else having problems with wm(4) in current ?
Works fine in a kernel from Oct 5, doesn't do anything in latest version.
wm0 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: interrupting at ioapic0 pin 17
Not MSI-X but INTx? Are you using Xen dom0 or non x86 machine?
It is an AMD amd64 machine, not running Xen.
a) Old or broken PCI hostbridge. There is a blacklist in x86/pci/pci_machdep.c
(pci_msi_quirk_tbl).
b) A lot of interrupts are assigned to cpu0. On x86, interrupt sources are
managed by cpu_info->ci_isources[], which has a fixed size. Legacy interrupts
can share one entry, but MSI/MSI-X can't.
I suspect your machine is in b).
I've now committed a change that should fix your problem. Could you test
the latest if_wm.c?
Works fine now, thanks for your help.
You're welcome.
Post by Robert Swindells
The hostbridge isn't listed in the quirks table, it is a RS780.
If it's OK with you, could you show me the full dmesg and the output of
"cpuctl list"? (not intrctl but cpuctl)
Post by Robert Swindells
% cpuctl list
Num HwId Unbound LWPs Interrupts Last change #Intr
---- ---- ------------ ---------- ------------------------ -----
0 0 online intr Mon Aug 3 12:46:18 2015 25 <====
1 1 online intr Mon Aug 3 12:46:18 2015 0
I'd like to check the above value.

Thanks in advance.
Post by Robert Swindells
Robert Swindells
--
-----------------------------------------------
SAITOH Masanobu (***@execsw.org
***@netbsd.org)
