Discussion:
ipv6 TSO
YAMAMOTO Takashi
2006-11-17 11:51:37 UTC
the attached patches implement ipv6 TSO for wm(4).
(partly from Matthias Scheler.)

please review and/or test.

YAMAMOTO Takashi
Matthias Scheler
2006-11-20 16:08:35 UTC
Post by YAMAMOTO Takashi
the attached patches implement ipv6 TSO for wm(4).
(partly from Matthias Scheler.)
The patch for "mbuf.h" didn't apply cleanly. I've attached a modified
version which does.
Post by YAMAMOTO Takashi
please review and/or test.
It works fine for me on this network card ...

wm0 at pci3 dev 0 function 0: Intel i82573L Gigabit Ethernet, rev. 0
wm0: interrupting at ioapic0 pin 17 (irq 10)
wm0: PCI-Express bus
wm0: 256 word (8 address bits) SPI EEPROM
wm0: Ethernet address 00:15:f2:xx:xx:xx
makphy0 at wm0 phy 1: Marvell 88E1111 Gigabit PHY, rev. 2

... with these features enabled:

wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=6bf80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Tx,UDP6CSUM_Tx,TSO6>
address: 00:15:f2:xx:xx:xx
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
status: active
[...]
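
A minimal sketch for comparing the two flag lists above, assuming the `capabilities=` and `enabled=` lines are pasted in as plain strings. The helper name `flag_set` is made up for illustration and is not part of any NetBSD tool:

```python
import re


def flag_set(line):
    """Return the comma-separated names between '<' and '>' as a set."""
    match = re.search(r"<([^>]*)>", line)
    return set(match.group(1).split(",")) if match else set()


# Lines copied from the ifconfig output quoted above.
capabilities = (
    "capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,"
    "UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>"
)
enabled = (
    "enabled=6bf80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,"
    "UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Tx,UDP6CSUM_Tx,TSO6>"
)

# Features the card advertises but that are not switched on.
not_enabled = flag_set(capabilities) - flag_set(enabled)
print(sorted(not_enabled))
```

On the lines above this reports the two IPv6 receive checksum flags as the only capabilities left disabled; TSO6 is in both sets.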

I've uploaded a 900MB file via FTP over TCPv6 (using a kernel producing some
extra debugging output when TSOv6 is used) and compared the SHA512 hashes
of the source and destination file.

Kind regards
--
Matthias Scheler http://zhadum.org.uk/
Hubert Feyrer
2006-11-20 23:52:56 UTC
Post by Matthias Scheler
I've uploaded a 900MB file via FTP over TCPv6 (using a kernel producing some
extra debugging output when TSOv6 is used) and compared the SHA512 hashes
of the source and destination file.
What was the time and cpu utilization difference (assuming that's the
point of TSO)?


- Hubert

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Matthias Scheler
2006-11-21 09:41:25 UTC
Post by Hubert Feyrer
Post by Matthias Scheler
I've uploaded a 900MB file via FTP over TCPv6 (using a kernel producing some
extra debugging output when TSOv6 is used) and compared the SHA512 hashes
of the source and destination file.
What was the time and cpu utilization difference (assuming that's the
point of TSO)?
I'm not sure how to measure that properly. As a quick test I used "rcp"
to transfer a 768MB file over IPv6 over Gigabit ethernet to "/dev/null"
on the remote machine:

With TSOv6:
0.05s user 4.13s system 20% cpu 20.189 total

Without TSOv6:
0.08s user 4.33s system 21% cpu 20.447 total

Second run with TSOv6:
0.08s user 4.33s system 21% cpu 20.447 total

Those numbers don't look conclusive to me.

Kind regards
--
Matthias Scheler http://zhadum.org.uk/

YAMAMOTO Takashi
2006-11-21 10:10:12 UTC
Post by Matthias Scheler
Post by Hubert Feyrer
Post by Matthias Scheler
I've uploaded a 900MB file via FTP over TCPv6 (using a kernel producing some
extra debugging output when TSOv6 is used) and compared the SHA512 hashes
of the source and destination file.
What was the time and cpu utilization difference (assuming that's the
point of TSO)?
I'm not sure how to measure that properly. As a quick test I used "rcp"
to transfer a 768MB file over IPv6 over Gigabit ethernet to "/dev/null"
0.05s user 4.13s system 20% cpu 20.189 total
0.08s user 4.33s system 21% cpu 20.447 total
0.08s user 4.33s system 21% cpu 20.447 total
Those numbers don't look conclusive to me.
can you try netperf -t TCPIPV6_STREAM?

YAMAMOTO Takashi

Steven M. Bellovin
2006-11-21 12:50:28 UTC
Post by Matthias Scheler
Post by Hubert Feyrer
Post by Matthias Scheler
I've uploaded a 900MB file via FTP over TCPv6 (using a kernel producing some
extra debugging output when TSOv6 is used) and compared the SHA512 hashes
of the source and destination file.
What was the time and cpu utilization difference (assuming that's the
point of TSO)?
I'm not sure how to measure that properly. As a quick test I used "rcp"
to transfer a 768MB file over IPv6 over Gigabit ethernet to "/dev/null"
Hmm -- try running some other CPU-bound process in the background. When
the file transfer is finished, kill it and see how much CPU time it got
(or, perhaps, how many times through some loop it got).

Rationale: checksum time is kernel time that's probably taken ultimately
from the idle time. Your application used 20% of the CPU, so maybe it
would show as 20% of the checksum time being billed to you, but that's not
clear -- in the steady state, the application's write() calls are going to
be correlated with when checksum calculations are done.

--Steven M. Bellovin, http://www.cs.columbia.edu/~smb
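
A rough sketch of the background-process idea above, assuming Python is acceptable as the CPU burner (a C loop would do equally well). The names `spin` and `install_handlers` are hypothetical. Start it before the transfer, send it SIGINT or SIGTERM once the transfer finishes, and compare the iteration counts between a TSO run and a non-TSO run:

```python
import signal
import time

_stop = False


def _request_stop(signum, frame):
    """Signal handler: ask the busy loop to finish."""
    global _stop
    _stop = True


def install_handlers():
    """Stop the loop on SIGINT or SIGTERM (call from the main thread)."""
    signal.signal(signal.SIGINT, _request_stop)
    signal.signal(signal.SIGTERM, _request_stop)


def spin(max_seconds=None):
    """Busy-loop until a stop signal arrives or max_seconds of wall time pass.

    Returns (iterations, cpu_seconds) for this process.
    """
    global _stop
    _stop = False
    iterations = 0
    cpu_start = time.process_time()
    wall_start = time.monotonic()
    while not _stop:
        iterations += 1
        # Look at the clock only every 10000 iterations so the loop stays CPU-bound.
        if max_seconds is not None and iterations % 10000 == 0:
            if time.monotonic() - wall_start >= max_seconds:
                break
    return iterations, time.process_time() - cpu_start


# Typical use, run as a script alongside the transfer:
#   install_handlers()
#   n, cpu = spin()
#   print(f"{n} iterations, {cpu:.2f}s CPU")
```

The more iterations the burner completes during a transfer of the same size, the less CPU the network path took away from it. Running it at the same priority as the transfer on an otherwise idle machine keeps the comparison fair.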

Matthias Scheler
2006-11-21 13:12:59 UTC
Post by YAMAMOTO Takashi
Post by Matthias Scheler
Those numbers don't look conclusive to me.
can you try netperf -t TCPIPV6_STREAM?
Yes, of course:

With TSOv6:
0.04s user 1.05s system 10% cpu 10.087 total
0.08s user 1.30s system 13% cpu 10.023 total

Without TSOv6:
0.02s user 1.69s system 17% cpu 10.014 total
0.03s user 1.59s system 16% cpu 10.061 total

That looks more conclusive.
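
To put a number on "more conclusive", here is a small sketch that averages the system time from the four `time` lines above. It assumes the zsh-style `time` output format shown in this post; `parse_time_line` and `mean_system_time` are names invented for the example:

```python
import re

# Matches lines such as "0.04s user 1.05s system 10% cpu 10.087 total".
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user\s+(?P<system>[\d.]+)s system\s+"
    r"(?P<cpu>\d+)% cpu\s+(?P<total>[\d.]+) total"
)


def parse_time_line(line):
    """Turn one `time` summary line into a dict of floats."""
    match = TIME_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognised time output: {line!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


def mean_system_time(lines):
    """Average the 'system' column over several runs."""
    samples = [parse_time_line(line)["system"] for line in lines]
    return sum(samples) / len(samples)


with_tso = [
    "0.04s user 1.05s system 10% cpu 10.087 total",
    "0.08s user 1.30s system 13% cpu 10.023 total",
]
without_tso = [
    "0.02s user 1.69s system 17% cpu 10.014 total",
    "0.03s user 1.59s system 16% cpu 10.061 total",
]

tso = mean_system_time(with_tso)
no_tso = mean_system_time(without_tso)
reduction = 1 - tso / no_tso
print(f"mean system time: {tso:.3f}s with TSOv6, {no_tso:.3f}s without")
print(f"reduction: {reduction:.1%}")
```

Two runs per configuration is a very small sample, so the result is only a rough indication. It also covers only the system time billed to the netperf process itself.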

Kind regards
--
Matthias Scheler http://zhadum.org.uk/

YAMAMOTO Takashi
2006-11-21 13:26:23 UTC
Post by Matthias Scheler
Post by YAMAMOTO Takashi
Post by Matthias Scheler
Those numbers don't look conclusive to me.
can you try netperf -t TCPIPV6_STREAM?
0.04s user 1.05s system 10% cpu 10.087 total
0.08s user 1.30s system 13% cpu 10.023 total
0.02s user 1.69s system 17% cpu 10.014 total
0.03s user 1.59s system 16% cpu 10.061 total
That looks more conclusive.
how many bits/s did they show?

YAMAMOTO Takashi

Matthias Scheler
2006-11-21 13:43:45 UTC
Post by YAMAMOTO Takashi
Post by Matthias Scheler
0.04s user 1.05s system 10% cpu 10.087 total
0.08s user 1.30s system 13% cpu 10.023 total
0.02s user 1.69s system 17% cpu 10.014 total
0.03s user 1.59s system 16% cpu 10.061 total
That looks more conclusive.
how many bits/s did they show?
With TSOv6:

TCPIPV6 STREAM TEST to ... : histogram
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

32768 32768 32768 10.01 806.06

0.04s user 0.97s system 9% cpu 10.117 total

Without TSOv6:
TCPIPV6 STREAM TEST to ... : histogram
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

32768 32768 32768 10.01 710.82
0.02s user 1.69s system 17% cpu 10.015 total

Kind regards
--
Matthias Scheler http://zhadum.org.uk/

Hubert Feyrer
2006-11-21 14:01:54 UTC
Post by Matthias Scheler
32768 32768 32768 10.01 806.06
32768 32768 32768 10.01 710.82
So it's 710 Mbit/s -> 806 Mbit/s - that sounds pretty good!

Do you have comparable numbers (with/without TSO) for IPv4 too?


- Hubert

Matthias Scheler
2006-11-21 14:07:38 UTC
Post by Hubert Feyrer
Post by Matthias Scheler
32768 32768 32768 10.01 806.06
32768 32768 32768 10.01 710.82
So it's 710 Mbit/s -> 806 Mbit/s - that sounds pretty good!
Using less CPU power!
Post by Hubert Feyrer
Do you have comparable numbers (with/without TSO) for IPv4 too?
Here they are:

With TSOv4:
TCP STREAM TEST to ... : histogram
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

32768 32768 32768 10.06 713.35
0.03s user 1.08s system 11% cpu 10.071 total

Without TSOv4:
TCP STREAM TEST to ... : histogram
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

32768 32768 32768 10.06 624.60
0.01s user 1.24s system 12% cpu 10.061 total
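
For a side-by-side view, the sketch below takes the four netperf results from this thread (throughput in 10^6 bits/sec, elapsed seconds, and the system time reported by `time`) and derives the relative throughput gain and the system CPU seconds spent per gigabit sent. The `Run` class and `gain` function are illustrative names only, and the system time covers just what was billed to the netperf process:

```python
from dataclasses import dataclass


@dataclass
class Run:
    label: str
    mbit_per_s: float  # netperf throughput, 10^6 bits/sec
    elapsed_s: float   # netperf elapsed time
    system_s: float    # system time reported by `time`

    def gbit_sent(self):
        """Approximate data volume in gigabits."""
        return self.mbit_per_s * self.elapsed_s / 1000.0

    def system_s_per_gbit(self):
        """System CPU seconds charged per gigabit sent."""
        return self.system_s / self.gbit_sent()


def gain(with_tso, without_tso):
    """Relative throughput improvement of the TSO run."""
    return with_tso.mbit_per_s / without_tso.mbit_per_s - 1


# Figures copied from the netperf output posted in this thread.
v6_tso = Run("IPv6 with TSO", 806.06, 10.01, 0.97)
v6_plain = Run("IPv6 without TSO", 710.82, 10.01, 1.69)
v4_tso = Run("IPv4 with TSO", 713.35, 10.06, 1.08)
v4_plain = Run("IPv4 without TSO", 624.60, 10.06, 1.24)

for tso_run, plain_run in ((v6_tso, v6_plain), (v4_tso, v4_plain)):
    print(f"{tso_run.label}: {gain(tso_run, plain_run):+.1%} throughput")
    for run in (tso_run, plain_run):
        print(f"  {run.label}: {run.system_s_per_gbit():.3f} s system per Gbit")
```

Each configuration was measured once, so these ratios are a rough guide and not a benchmark.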

Kind regards
--
Matthias Scheler http://zhadum.org.uk/
