atrocious tx performance of gigabit cardbus re(4)
Jonathan A. Kollasch
2007-09-24 16:24:01 UTC
Hi,

I've got a Netgear GA511 (Gigabit CardBus re(4)). I recently got
my first real gigabit peer (other than the gigabit ports on
my 10/100+1000 switch). When I have the nfe(4) send
a TCP stream at the re(4), I can get a reasonable (for some
acceptable value thereof) 30 Mbytes/s. However, when I send
from the re(4) to the nfe(4) I can only get 4 Mbytes/s.
From 100Mbps sources on the same switch I can get full
Fast Ethernet performance to the nfe(4), so it's not
that end's problem.

Tests were performed using `nc6 -x` and progress(1).

The cbb(4)s I'm using are TI, a product 0x8031 rev 0x0 and
a product 0x8039 rev 0x00 on NetBSD/i386 laptops running
4.0 BETA2 and 4.99.22.

ISTR dyoung@ had performance problems with TI CardBus bridges
before, but I'm not sure if those issues are also affecting
me.

Debugging tips would be appreciated.

Jonathan Kollasch
David Young
2007-09-24 16:35:50 UTC
Post by Jonathan A. Kollasch
Hi,
I've got a Netgear GA511 (Gigabit CardBus re(4)). I recently got
my first real gigabit peer (other than the gigabit ports on
my 10/100+1000 switch). When I have the nfe(4) send
a TCP stream at the re(4), I can get a reasonable (for some
acceptable value thereof) 30 Mbytes/s. However, when I send
from the re(4) to the nfe(4) I can only get 4 Mbytes/s.
From 100Mbps sources on the same switch I can get full
Fast Ethernet performance to the nfe(4), so it's not
that end's problem.
Tests were performed using `nc6 -x` and progress(1).
The cbb(4)s I'm using are TI, a product 0x8031 rev 0x0 and
a product 0x8039 rev 0x00 on NetBSD/i386 laptops running
4.0 BETA2 and 4.99.22.
ISTR dyoung@ had performance problems with TI CardBus bridges
before, but I'm not sure if those issues are also affecting
me.
Hi Jonathan,

I have some patches that may help. The problem is that NetBSD does not
enable read bursts on the PCI side of the bridge, so the bridge does
single-cycle transactions on the NIC's behalf. What does dmesg say
about your cbb(4) ?
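
A rough way to see why single-cycle transactions hurt so much is a back-of-envelope model of a 32-bit, 33 MHz bus. The sketch below is only illustrative: the per-transaction overhead of 8 clocks (`OVERHEAD_CLOCKS`) is an assumed figure, not something measured on these bridges, and `throughput_mbytes` is a name made up for this example.

```python
# Back-of-envelope model of a 32-bit, 33 MHz PCI/CardBus segment.
# OVERHEAD_CLOCKS is an ASSUMED, illustrative figure, not a measurement.

BUS_HZ = 33_000_000        # nominal bus clock (33 MHz)
BYTES_PER_DATA_PHASE = 4   # 32-bit data path
OVERHEAD_CLOCKS = 8        # ASSUMPTION: arbitration, address phase, wait states, turnaround


def throughput_mbytes(data_phases: int, overhead: int = OVERHEAD_CLOCKS) -> float:
    """Approximate throughput when every transaction moves `data_phases` dwords."""
    clocks_per_transaction = overhead + data_phases
    bytes_per_second = BUS_HZ * BYTES_PER_DATA_PHASE * data_phases / clocks_per_transaction
    return bytes_per_second / 1e6


# One data phase per transaction is the single-cycle case; larger values model bursts.
for n in (1, 4, 16, 64):
    print(f"{n:3d} data phases per transaction: {throughput_mbytes(n):6.1f} Mbytes/s")
```

Whatever the real overhead is, the shape of the result is the same: with one data phase per transaction most bus clocks are spent on overhead, and longer bursts amortize it.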

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Jonathan A. Kollasch
2007-09-25 19:22:23 UTC
Post by David Young
Post by Jonathan A. Kollasch
Hi,
I've got a Netgear GA511 (Gigabit CardBus re(4)). I recently got
.......
me.
Hi Jonathan,
I have some patches that may help. The problem is that NetBSD does not
enable read bursts on the PCI side of the bridge, so the bridge does
single-cycle transactions on the NIC's behalf. What does dmesg say
about your cbb(4) ?
Nothing unusual AFAICT:

The 4.99.22 on a Toshiba A135-S4527:

cbb0 at pci4 dev 4 function 0: Texas Instruments product 0x8039 (rev. 0x00)
... (other functions of the chip: 1394, media card reader, sdhci)
cbb0: interrupting at ioapic0 pin 16 (irq 255)
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 5
pcmcia0 at cardslot0

The other box (Compaq M2005US) is essentially the same,
but the product ID is recognized. ("PCI7x21/7x11 ...")

The actual chip is marked PCI7411 in the Compaq,
I've not opened the Toshiba.

<long pause>

I took the liberty of testing your patch from
http://mail-index.netbsd.org/current-users/2007/08/10/0004.html
(well, not the patch itself, just unconditionally setting the bits).
TX performance on the Compaq (1.3GHz Celeron Dothan) increased to
about 16Mbytes/s. RX increased some too.

Jonathan Kollasch
David Young
2007-09-25 19:32:32 UTC
Post by Jonathan A. Kollasch
Post by David Young
Post by Jonathan A. Kollasch
Hi,
I've got a Netgear GA511 (Gigabit CardBus re(4)). I recently got
.......
me.
Hi Jonathan,
I have some patches that may help. The problem is that NetBSD does not
enable read bursts on the PCI side of the bridge, so the bridge does
single-cycle transactions on the NIC's behalf. What does dmesg say
about your cbb(4) ?
cbb0 at pci4 dev 4 function 0: Texas Instruments product 0x8039 (rev. 0x00)
... (other functions of the chip: 1394, media card reader, sdhci)
cbb0: interrupting at ioapic0 pin 16 (irq 255)
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 5
pcmcia0 at cardslot0
The other box (Compaq M2005US) is essentially the same,
but the product ID is recognized. ("PCI7x21/7x11 ...")
The actual chip is marked PCI7411 in the Compaq,
I've not opened the Toshiba.
<long pause>
I took the liberty of testing your patch from
http://mail-index.netbsd.org/current-users/2007/08/10/0004.html
(well, not the patch itself, just unconditionally setting the bits).
TX performance on the Compaq (1.3GHz Celeron Dothan) increased to
about 16Mbytes/s. RX increased some too.
That's good news!

It may be possible to get even higher performance by tweaking parameters
both on the bridge and on the NIC. For example, the latency timer on the
primary (PCI) bus may be rather small. On the NIC, both the PCI latency
timer and the DMA burst parameters (highly NIC-specific) may be too small.
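
For reference, the PCI latency timer lives in the standard configuration-space dword at offset 0x0C, alongside the cache line size, header type and BIST fields. The following is a minimal sketch of decoding and changing that field, in plain Python rather than kernel code. The sample register value is made up (it is not from the hardware in this thread), and the names `Bhlc`, `decode_bhlc` and `with_latency_timer` are hypothetical.

```python
from typing import NamedTuple


class Bhlc(NamedTuple):
    cache_line_size: int   # bits 7:0, in units of 32-bit dwords
    latency_timer: int     # bits 15:8, in PCI clocks
    header_type: int       # bits 22:16 (0 = device, 1 = PCI-PCI bridge, 2 = CardBus bridge)
    multifunction: bool    # bit 23
    bist: int              # bits 31:24


def decode_bhlc(reg: int) -> Bhlc:
    """Split the config dword at offset 0x0C into its four byte-wide fields."""
    return Bhlc(
        cache_line_size=reg & 0xFF,
        latency_timer=(reg >> 8) & 0xFF,
        header_type=(reg >> 16) & 0x7F,
        multifunction=bool((reg >> 23) & 1),
        bist=(reg >> 24) & 0xFF,
    )


def with_latency_timer(reg: int, clocks: int) -> int:
    """Return `reg` with only the latency timer byte replaced."""
    if not 0 <= clocks <= 0xFF:
        raise ValueError("latency timer is an 8-bit field")
    return (reg & ~(0xFF << 8) & 0xFFFFFFFF) | (clocks << 8)


sample = 0x00822008  # made-up value for illustration only
print(decode_bhlc(sample))
print(hex(with_latency_timer(sample, 0x40)))
```

A real change would be a read-modify-write of the device's configuration space from the driver or bridge code; this sketch only shows the bit layout.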

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

Steven M. Bellovin
2007-09-25 19:58:09 UTC
On Tue, 25 Sep 2007 14:32:32 -0500
David Young <***@pobox.com> wrote:

...
Post by David Young
It may be possible to get even higher performance by tweaking
parameters both on the bridge and on the NIC. For example, the
latency timer on the primary (PCI) bus may be rather small. On the
NIC, both the PCI latency timer and the DMA burst parameters (highly
NIC-specific) may be too small.
I'm trying to find my notes, but if memory serves correctly I had
serious performance issues with a PCI re(4) on a gigE network.


--Steve Bellovin, http://www.cs.columbia.edu/~smb

Steven M. Bellovin
2007-09-25 20:29:30 UTC
On Tue, 25 Sep 2007 13:00:23 -0700
Post by Steven M. Bellovin
On Tue, 25 Sep 2007 14:32:32 -0500
...
Post by David Young
It may be possible to get even higher performance by tweaking
parameters both on the bridge and on the NIC. For example, the
latency timer on the primary (PCI) bus may be rather small. On the
NIC, both the PCI latency timer and the DMA burst parameters
(highly NIC-specific) may be too small.
I'm trying to find my notes, but if memory serves correctly I had
serious performance issues with a PCI re(4) on a gigE network.
The driver and cardbus frontend are quite likely poorly
tuned or untuned.
If dim memory serves, I split the re driver and added the cardbus
frontend one weekend, while the dongle on my old 3com cardbus card
was dying. The replacement card I had picked up at Fry's was one of
the first re(4) cardbus cards to hit the market. I did no more than
get the darn thing to work at 100Mbit and get the link LEDs working
and transmit of *some* packets working at gigabit speed.
I found an old private email I sent on my tests. While I was certainly
seeing better performance than the OP reported here, the re(4) card was
much worse than wm on a gigE network talking to another NetBSD machine
that had a wm card.


-----
Here are four ttcp tests. The first two were with the re card; the
second two, in the same slot on the same machine, same cable, etc.,
were with a wm card.

b139$ ttcp -s -fk -r
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from 192.168.2.199
ttcp-r: 167772160 bytes in 3.66 real seconds = 357701.53 Kbit/sec +++
ttcp-r: 25991 I/O calls, msec/call = 0.14, calls/sec = 7093.06
ttcp-r: 0.0user 0.4sys 0:03real 11% 0i+0d 0maxrss 0+2pf 24920+39csw
b140$ ttcp -n 20480 -t -fk -s fubar
ttcp-t: buflen=8192, nbuf=20480, align=16384/0, port=5001 tcp -> fubar
ttcp-t: socket
ttcp-t: connect
ttcp-t: 167772160 bytes in 3.41 real seconds = 383868.23 Kbit/sec +++
ttcp-t: 20480 I/O calls, msec/call = 0.17, calls/sec = 5997.94
ttcp-t: 0.0user 0.4sys 0:03real 12% 0i+0d 0maxrss 0+40961pf 18890+35csw




b142$ ttcp -n 20480 -t -fk -s fubar
ttcp-t: buflen=8192, nbuf=20480, align=16384/0, port=5001 tcp -> fubar
ttcp-t: socket
ttcp-t: connect
ttcp-t: 167772160 bytes in 2.48 real seconds = 529463.40 Kbit/sec +++
ttcp-t: 20480 I/O calls, msec/call = 0.12, calls/sec = 8272.87
ttcp-t: 0.0user 0.5sys 0:02real 22% 0i+0d 0maxrss 0+40961pf 17057+46csw
b143$ ttcp -s -fk -r
ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001 tcp
ttcp-r: socket
ttcp-r: accept from 192.168.2.199
ttcp-r: 167772160 bytes in 2.50 real seconds = 524481.43 Kbit/sec +++
ttcp-r: 33499 I/O calls, msec/call = 0.08, calls/sec = 13404.54
ttcp-r: 0.0user 0.3sys 0:02real 15% 0i+0d 0maxrss 0+2pf 16841+46csw
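
To line these ttcp results up with the Mbytes/s figures quoted earlier in the thread, the summary lines can be converted from the byte count and elapsed time. This is a small sketch, assuming "Mbytes" means 10^6 bytes; the function name `mbytes_per_second` and the labels in `RESULTS` are made up for this example, while the four summary lines are copied from the output above.

```python
import re

# Matches the ttcp summary line, e.g.
# "ttcp-r: 167772160 bytes in 3.66 real seconds = 357701.53 Kbit/sec +++"
SUMMARY = re.compile(r"ttcp-[rt]: (\d+) bytes in ([\d.]+) real seconds")


def mbytes_per_second(line: str) -> float:
    """Throughput in Mbytes/s (ASSUMPTION: 1 Mbyte = 10**6 bytes)."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a ttcp summary line: {line!r}")
    nbytes = int(match.group(1))
    seconds = float(match.group(2))
    return nbytes / seconds / 1e6


RESULTS = {
    "re, ttcp -r": "ttcp-r: 167772160 bytes in 3.66 real seconds = 357701.53 Kbit/sec +++",
    "re, ttcp -t": "ttcp-t: 167772160 bytes in 3.41 real seconds = 383868.23 Kbit/sec +++",
    "wm, ttcp -t": "ttcp-t: 167772160 bytes in 2.48 real seconds = 529463.40 Kbit/sec +++",
    "wm, ttcp -r": "ttcp-r: 167772160 bytes in 2.50 real seconds = 524481.43 Kbit/sec +++",
}

for label, line in RESULTS.items():
    print(f"{label}: {mbytes_per_second(line):.1f} Mbytes/s")
```

Computed this way, the re runs come to roughly 46 and 49 Mbytes/s and the wm runs to about 67 to 68 Mbytes/s. The elapsed times are rounded to two decimals in the ttcp output, so the last digit is approximate.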




--Steve Bellovin, http://www.cs.columbia.edu/~smb

j***@dsg.stanford.edu
2007-09-25 20:00:23 UTC
Post by Steven M. Bellovin
On Tue, 25 Sep 2007 14:32:32 -0500
...
Post by David Young
It may be possible to get even higher performance by tweaking
parameters both on the bridge and on the NIC. For example, the
latency timer on the primary (PCI) bus may be rather small. On the
NIC, both the PCI latency timer and the DMA burst parameters (highly
NIC-specific) may be too small.
I'm trying to find my notes, but if memory serves correctly I had
serious performance issues with a PCI re(4) on a gigE network.
The driver and cardbus frontend are quite likely poorly
tuned or untuned.

If dim memory serves, I split the re driver and added the cardbus
frontend one weekend, while the dongle on my old 3com cardbus card
was dying. The replacement card I had picked up at Fry's was one of
the first re(4) cardbus cards to hit the market. I did no more than
get the darn thing to work at 100Mbit and get the link LEDs working
and transmit of *some* packets working at gigabit speed.
