Discussion:
gigE negotiation failure with wm0
Steven M. Bellovin
2008-09-09 15:44:31 UTC
On my laptop (Thinkpad T61 running amd64-current), if I connect it to a
NETGEAR GS608 Gigabit switch, it speaks properly at 1000M bps. If,
however, I connect it to a NETGEAR GS108 ProSafe switch, it can only
negotiate 10M bps, though it will talk at 100M bps if I configure it
manually. Power-cycling the switch doesn't help, nor does switching
cables or ports. Ubuntu 8.04 speaks GigE just fine on the exact same
hardware, so I suspect a driver issue.

Here are the wm0 and phy lines from dmesg:

wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
David Young
2008-09-09 16:07:51 UTC
Post by Steven M. Bellovin
On my laptop (Thinkpad T61 running amd64-current), if I connect it to a
NETGEAR GS608 Gigabit switch, it speaks properly at 1000M bps. If,
however, I connect it to a NETGEAR GS108 ProSafe switch, it can only
negotiate 10M bps, though it will talk at 100M bps if I configure it
manually. Power-cycling the switch doesn't help, nor do switching
cables or ports. Ubuntu 8.04 speaks GigE just fine on the exact same
hardware, so I suspect a driver issue.
wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
Steven,

Christoph Egger mentioned this patch to me yesterday:

http://article.gmane.org/gmane.os.openbsd.cvs/79169

I have not looked into it, so I don't know if it may help your problem
or not.

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

Steven M. Bellovin
2008-09-09 20:57:19 UTC
On Tue, 9 Sep 2008 11:07:51 -0500
Post by David Young
Post by Steven M. Bellovin
On my laptop (Thinkpad T61 running amd64-current), if I connect it
to a NETGEAR GS608 Gigabit switch, it speaks properly at 1000M
bps. If, however, I connect it to a NETGEAR GS108 ProSafe switch,
it can only negotiate 10M bps, though it will talk at 100M bps if I
configure it manually. Power-cycling the switch doesn't help, nor
do switching cables or ports. Ubuntu 8.04 speaks GigE just fine on
the exact same hardware, so I suspect a driver issue.
wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
Steven,
Christoph Egger mentioned this patch to me, yesterday,
http://article.gmane.org/gmane.os.openbsd.cvs/79169
I have not looked into it, so I don't know if it may help your problem
or not.
Interesting -- the failing switch is more intended for managed
environments with VLANs, tags, etc.; it might indeed be related. Now
to figure out how to port the patch to NetBSD.

Thanks.


--Steve Bellovin, http://www.cs.columbia.edu/~smb

Steven M. Bellovin
2008-09-09 21:12:20 UTC
On Tue, 9 Sep 2008 16:57:19 -0400
Post by Steven M. Bellovin
Post by David Young
Christoph Egger mentioned this patch to me, yesterday,
http://article.gmane.org/gmane.os.openbsd.cvs/79169
I have not looked into it, so I don't know if it may help your
problem or not.
Interesting -- the failing switch is more intended for managed
environments with VLANs, tags, etc.; it might indeed be related. Now
to figure out how to port the patch to NetBSD.
Thanks.
Since the patches were (mostly) to phy routines, I tried booting with
'disable igphy'. It fell back to ukphy and *claims* to have negotiated
GigE. The lights on the switch suggest otherwise, and the speed is
more indicative of a 100BaseT connection. I'll try it again at home,
on a switch with better lights, and will see what happens.

--Steve Bellovin, http://www.cs.columbia.edu/~smb

Brad
2008-09-09 21:41:57 UTC
On Tue, 9 Sep 2008 11:07:51 -0500
Post by David Young
Post by Steven M. Bellovin
wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
Steven,
Christoph Egger mentioned this patch to me, yesterday,
http://article.gmane.org/gmane.os.openbsd.cvs/79169
I have not looked into it, so I don't know if it may help your problem
or not.
Dave
This will not help with the problem. Looking at igphy(4), I can see that it
is programming DSP override values into the third-generation IGP PHY, which
it should not do. First, I would make the code call the DSP load function
only for the first two generations of PHY (represented by a single PHY ID).
Then I would also take a look at...

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/e1000/e1000_phy.c

e1000_phy_init_script_igp3(), then create another DSP load function for
the third-generation PHY and plug in those DSP values.
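
A minimal sketch of the per-generation dispatch suggested above, written as
standalone C rather than as a patch to igphy.c. Every name here (the
ex_ prefix, the enum, the tables) is hypothetical, and the register/value
pairs are placeholders, not the real DSP values; those would have to come
from the vendor init scripts. It only shows the shape: one table for the
first two generations, a separate one for the third, and nothing at all for
any other PHY.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PHY generations; a real driver would derive this from the
 * MII OUI/model pair, not from an enum like this. */
enum ex_phy_gen { EX_PHY_IGP_GEN1_2, EX_PHY_IGP_GEN3, EX_PHY_OTHER };

/* One DSP register write. */
struct ex_dsp_write {
	unsigned reg;
	unsigned val;
};

/* Placeholder tables: the values below are NOT real DSP overrides. */
static const struct ex_dsp_write ex_dsp_gen1_2[] = { { 0x0000, 0x0000 } };
static const struct ex_dsp_write ex_dsp_gen3[]   = { { 0x0000, 0x0000 } };

/*
 * Pick the DSP table for a PHY generation.  Returns NULL and sets
 * *count to 0 when no DSP values should be loaded for that PHY.
 */
static const struct ex_dsp_write *
ex_dsp_table_for(enum ex_phy_gen gen, size_t *count)
{
	assert(count != NULL);

	switch (gen) {
	case EX_PHY_IGP_GEN1_2:
		*count = sizeof(ex_dsp_gen1_2) / sizeof(ex_dsp_gen1_2[0]);
		return ex_dsp_gen1_2;
	case EX_PHY_IGP_GEN3:
		*count = sizeof(ex_dsp_gen3) / sizeof(ex_dsp_gen3[0]);
		return ex_dsp_gen3;
	default:
		*count = 0;
		return NULL;
	}
}
```

The point of returning NULL for unknown generations is that a PHY never
receives override values written for a different part, which is the failure
described above.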
Steven M. Bellovin
2008-09-10 00:20:05 UTC
On Tue, 9 Sep 2008 17:41:57 -0400
Post by Steven M. Bellovin
On Tue, 9 Sep 2008 11:07:51 -0500
Post by David Young
Post by Steven M. Bellovin
wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
Steven,
Christoph Egger mentioned this patch to me, yesterday,
http://article.gmane.org/gmane.os.openbsd.cvs/79169
I have not looked into it, so I don't know if it may help your
problem or not.
Dave
This will not help with the problem. Looking at igphy(4) I can see it
is programming DSP override values into the third generation IGP PHY
which it should not. First I would make the code only call the DSP
load function for the first two generations of PHY (represented by a
single PHY id). Then I would also take a look at...
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/e1000/e1000_phy.c
e1000_phy_init_script_igp3() and create another DSP load function for
the third generation PHY and plug in those DSP values.
Interesting. Running with ukphy at home, I'm happily speaking GigE,
and getting ttcp throughput of >750M bps in one direction (the other
end is i386-current with bge0) and 600M bps in the other. I'll see
what happens tomorrow with the problematic switch. But -- do I lose
anything by using ukphy instead of igphy?


--Steve Bellovin, http://www.cs.columbia.edu/~smb

Steven M. Bellovin
2008-11-11 19:13:57 UTC
On Tue, 9 Sep 2008 11:07:51 -0500
Post by David Young
Post by Steven M. Bellovin
On my laptop (Thinkpad T61 running amd64-current), if I connect it
to a NETGEAR GS608 Gigabit switch, it speaks properly at 1000M
bps. If, however, I connect it to a NETGEAR GS108 ProSafe switch,
it can only negotiate 10M bps, though it will talk at 100M bps if I
configure it manually. Power-cycling the switch doesn't help, nor
do switching cables or ports. Ubuntu 8.04 speaks GigE just fine on
the exact same hardware, so I suspect a driver issue.
wm0 at pci0 dev 25 function 0: Intel i82801H (M_AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20 (irq 11)
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address 00:1e:37:18:93:c5
igphy0 at wm0 phy 1: i82566 10/100/1000 media interface, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto
Steven,
Christoph Egger mentioned this patch to me, yesterday,
http://article.gmane.org/gmane.os.openbsd.cvs/79169
I have not looked into it, so I don't know if it may help your problem
or not.
Perhaps coincidentally, the switch failed overnight, and the (nominally
identical) replacement works at GigE speeds...


--Steve Bellovin, http://www.cs.columbia.edu/~smb

Kevin Lahey
2009-08-05 20:27:37 UTC
On Tue, 9 Sep 2008 17:41:57 -0400
Post by Brad
This will not help with the problem. Looking at igphy(4) I can see it
is programming DSP override values into the third generation IGP PHY
which it should not. First I would make the code only call the DSP
load function for the first two generations of PHY (represented by a
single PHY id). Then I would also take a look at...
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/e1000/e1000_phy.c
e1000_phy_init_script_igp3() and create another DSP load function for
the third generation PHY and plug in those DSP values.
Perusing the 36,000 or so lines of code [*] in the FreeBSD driver, it
looks like the smart speed workarounds only apply to that first
generation igp, and the dsp code is not only igp-specific, but is
really only used when there's no EEPROM. Given all that, is it a
serious problem to just remove the later generation PHYs from this file
altogether and let them fall through to ukphy? That, as they say,
works for me on my T61.

RCS file: /cvsroot/src/sys/dev/mii/igphy.c,v
retrieving revision 1.17
diff -u -r1.17 igphy.c
--- igphy.c 17 Nov 2008 03:04:27 -0000 1.17
+++ igphy.c 5 Aug 2009 20:21:58 -0000
@@ -116,9 +116,6 @@
{ MII_OUI_yyINTEL, MII_MODEL_yyINTEL_IGP01E1000,
MII_STR_yyINTEL_IGP01E1000 },

- { MII_OUI_yyINTEL, MII_MODEL_yyINTEL_I82566,
- MII_STR_yyINTEL_I82566 },
-
{0, 0,
NULL },
};

Kevin
***@patheticgeek.net

[*] Apparently UNIX Version 6 was about 8,200 lines of C and 900 lines
of assembler.
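
A simplified sketch of why deleting that table entry makes the 82566 fall
through to ukphy. This is standalone C, not the NetBSD mii code: the struct,
the ex_/EX_ names, and the numeric IDs are all hypothetical stand-ins, and
real autoconfiguration ranks competing drivers by match priority rather than
with a simple if/else. The table layout (entries ended by a {0, 0, NULL}
sentinel) mirrors the one in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One supported-PHY entry: OUI, model number, description. */
struct ex_phy_desc {
	unsigned oui;
	unsigned model;
	const char *name;
};

/* Hypothetical ID values, for illustration only. */
#define EX_OUI_INTEL        0x1u
#define EX_MODEL_IGP01E1000 0x10u
#define EX_MODEL_I82566     0x11u

/* Table after the change: the I82566 entry is gone. */
static const struct ex_phy_desc ex_igphy_table[] = {
	{ EX_OUI_INTEL, EX_MODEL_IGP01E1000, "IGP01E1000" },
	{ 0, 0, NULL },		/* sentinel */
};

/* Walk the table until the sentinel; return the entry or NULL. */
static const struct ex_phy_desc *
ex_phy_match(const struct ex_phy_desc *table, unsigned oui, unsigned model)
{
	assert(table != NULL);

	for (; table->name != NULL; table++) {
		if (table->oui == oui && table->model == model)
			return table;
	}
	return NULL;
}

/* The specific driver claims what its table lists; the generic one
 * takes everything else. */
static const char *
ex_driver_for(unsigned oui, unsigned model)
{
	if (ex_phy_match(ex_igphy_table, oui, model) != NULL)
		return "igphy";
	return "ukphy";
}
```

With the entry removed, ex_driver_for(EX_OUI_INTEL, EX_MODEL_I82566) yields
"ukphy", while the first-generation part is still claimed by "igphy".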

Simon Burge
2009-08-05 22:25:01 UTC
Post by Kevin Lahey
Perusing the 36,000 or so lines of code [*] in the FreeBSD driver
[ ... ]
[*] Apparently UNIX Version 6 was about 8,200 lines of C and 900 lines
of assembler.
Never let the facts get in the way of a good story! V6 was about 50,800
lines of assembler and 40,800 lines of C (source + headers). V5 wasn't
that much smaller.

This of course isn't justification for any single driver containing
36,000 lines of source, but perhaps we can mark the milestone when the
driver reaches 40,781 lines of code :-)

Cheers,
Simon.

Thor Lancelot Simon
2009-08-06 01:03:25 UTC
Post by Kevin Lahey
Perusing the 36,000 or so lines of code [*] in the FreeBSD driver, it
looks like the smart speed workarounds only apply to that first
generation igp, and the dsp code is not only igp-specific, but is
really only used when there's no EEPROM. Given all that, is it a
serious problem to just remove the later generation PHYs from this file
altogether and let them fall through to ukphy? That, as they say,
works for me on my T61.
I think you should check that change in.

Thor

SAITOH Masanobu
2009-12-16 05:01:23 UTC
Hi, all.
Post by Steven M. Bellovin
On Tue, 9 Sep 2008 17:41:57 -0400
Post by Brad
This will not help with the problem. Looking at igphy(4) I can see it
is programming DSP override values into the third generation IGP PHY
which it should not. First I would make the code only call the DSP
load function for the first two generations of PHY (represented by a
single PHY id). Then I would also take a look at...
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/e1000/e1000_phy.c
e1000_phy_init_script_igp3() and create another DSP load function for
the third generation PHY and plug in those DSP values.
Perusing the 36,000 or so lines of code [*] in the FreeBSD driver, it
looks like the smart speed workarounds only apply to that first
generation igp, and the dsp code is not only igp-specific, but is
really only used when there's no EEPROM. Given all that, is it a
serious problem to just remove the later generation PHYs from this file
altogether and let them fall through to ukphy? That, as they say,
works for me on my T61.
RCS file: /cvsroot/src/sys/dev/mii/igphy.c,v
retrieving revision 1.17
diff -u -r1.17 igphy.c
--- igphy.c 17 Nov 2008 03:04:27 -0000 1.17
+++ igphy.c 5 Aug 2009 20:21:58 -0000
@@ -116,9 +116,6 @@
{ MII_OUI_yyINTEL, MII_MODEL_yyINTEL_IGP01E1000,
MII_STR_yyINTEL_IGP01E1000 },
- { MII_OUI_yyINTEL, MII_MODEL_yyINTEL_I82566,
- MII_STR_yyINTEL_I82566 },
-
{0, 0,
NULL },
};
Kevin
[*] Apparently UNIX Version 6 was about 8,200 lines of C and 900 lines
of assembler.
I've committed the fix for igphy and re-enabled it. Could you verify whether
that bug has been fixed? I've verified it on ICH8 and ICH9.

=================
Module Name: src
Committed By: msaitoh
Date: Wed Dec 16 04:50:36 UTC 2009

Modified Files:
src/sys/dev/mii: igphy.c
src/sys/dev/pci: if_wm.c
Added Files:
src/sys/dev/pci: if_wmvar.h

Log Message:
Re-enable igphy's 82566 support.
- Patch for the DSP code is only for 8254[17] and we have to apply the
different patches between rev. 1 and rev. 2.
- The workaround for analog fuse is only for 82547 rev. 1.
- The workaround for smartspeed is only for 8254[17]

see http://mail-index.netbsd.org/tech-net/2009/08/05/msg001546.html


To generate a diff of this commit:
cvs rdiff -u -r1.18 -r1.19 src/sys/dev/mii/igphy.c
cvs rdiff -u -r1.181 -r1.182 src/sys/dev/pci/if_wm.c
cvs rdiff -u -r0 -r1.3 src/sys/dev/pci/if_wmvar.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
=================

----------------------------------------------------------
SAITOH Masanobu (***@iij.ad.jp
***@netbsd.org)

Kevin Lahey
2010-02-17 23:07:40 UTC
On Wed, 16 Dec 2009 14:01:23 +0900 (JST)
Post by SAITOH Masanobu
I've committed the fix for igphy and re-enabled it. Could you
verify whether that bug has been fixed? I've verified it on
ICH8 and ICH9.
Sorry to take so long to test this!

Unfortunately, I still see issues. I suspend my T61 at home, on a
100Mb network, then resume at work, on a 1Gb network. I see high
packet loss rates and all sorts of weird difficulties until I
"ifconfig wm0 down; ifconfig wm0 up", after which everything is
fine.

If I use ukphy instead of igphy, things just work.

Is there some debugging code I could activate to get more
information? After having a good look at the em/e1000 code, I can
understand how crazy this stuff gets. lspci claims that my device
is an Intel Corporation 82566MM Gigabit Network Connection
[8086:1049].

Thanks for all of your hard work,

Kevin
***@patheticgeek.net

SAITOH Masanobu
2010-03-09 04:56:05 UTC
Hi, Kevin.


From: Kevin Lahey <***@patheticgeek.net>
Subject: Re: gigE negotiation failure with wm0
Date: Wed, 17 Feb 2010 15:07:40 -0800
Post by Kevin Lahey
On Wed, 16 Dec 2009 14:01:23 +0900 (JST)
Post by SAITOH Masanobu
I've committed the fix for igphy and re-enabled it. Could you
verify whether that bug has been fixed? I've verified it on
ICH8 and ICH9.
Sorry to take so long to test this!
Unfortunately, I still see issues. I suspend my T61 at home, on a
100Mb network, then resume at work, on a 1Gb network. I see high
packet loss rates and all sorts of weird difficulties until I
"ifconfig wm0 down; ifconfig wm0 up", after which everything is
fine.
If I use ukphy instead of igphy, things just work.
Is there some debugging code I could activate to get more
information? After having a good look at the em/e1000 code, I can
understand how crazy this stuff gets. lspci claims that my device
is an Intel Corporation 82566MM Gigabit Network Connection
[8086:1049].
Thanks for all of your hard work,
Kevin
I added three workarounds for ICH8 with igphy to wm. Two workarounds
are enabled by default and the third is disabled. To enable
the third workaround, add

#define WM_WOL 1

to if_wm.c.

Could you try it?
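
As a rough illustration of how a compile-time define like this can gate an
optional workaround, here is a standalone C sketch. It is not code from
if_wm.c: the flag names and the function are hypothetical, and whether the
driver tests the macro with #ifdef or #if is an assumption. Only the
"two on by default, third enabled by WM_WOL" arrangement comes from the
message above.

```c
#include <assert.h>

#define WM_WOL 1	/* the define suggested in the message above */

/* Hypothetical flag bits standing in for the three workarounds. */
#define EX_WORKAROUND_FIRST  0x1u	/* enabled by default */
#define EX_WORKAROUND_SECOND 0x2u	/* enabled by default */
#define EX_WORKAROUND_THIRD  0x4u	/* only when WM_WOL is defined */

/*
 * Return the set of active workarounds.  'defaults' must not already
 * contain the optional bit; that one is decided at compile time.
 */
static unsigned
ex_enabled_workarounds(unsigned defaults)
{
	unsigned flags = defaults;

	assert((defaults & EX_WORKAROUND_THIRD) == 0);

#ifdef WM_WOL
	flags |= EX_WORKAROUND_THIRD;
#endif
	return flags;
}
```

Because the choice is made by the preprocessor, changing it means editing
the source and rebuilding the kernel; it cannot be toggled at run time.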

----------------------------------------------------------
SAITOH Masanobu (***@iij.ad.jp
***@netbsd.org)

Kevin Lahey
2010-03-10 22:42:10 UTC
On Tue, 09 Mar 2010 13:56:05 +0900 (JST)
Post by SAITOH Masanobu
I added three workarounds for ICH8 with igphy to wm. Two
workarounds are enabled by default and the third is
disabled. To enable the third workaround, add
#define WM_WOL 1
into if_wm.c.
Could you try it?
Running a kernel with WM_DEBUG defined, it looks like it works!
Thanks for your hard work, and sorry for not writing to you with test
results sooner. I'll try it without WM_DEBUG defined tomorrow
just to verify.

Awesome!

Kevin
***@patheticgeek.net
