nfs kernel booting is slow

Discussion:

(too old to reply)

scole_mail

2017-11-19 23:03:54 UTC

Hello,

I've got an old powermac that I boot diskless. But when I tried to put
it on a newer and faster switch, the kernel takes forever to nfs load
and finally boot. Once the kernel has loaded, the ethernet works fine.

If I leave the the old switch in, it works fine.

Slow kernel load:

powermac <==> 10/100/1000Mbps
nfs root new switch

Works fine:
powermac <==> 10/100 Mbps <==> 10/100/1000Mbps
nfs root old switch new switch

I trying to ditch my old switch. Both switches supposedly support
10Base-T which is what my powermac has. I got some tcpdump and
wireshark captures for both cases, but I'm not sure what to look for.

Not Working:
08:32:03.148935 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 41 104 read fh 0,0/-293204188 1024 bytes @ 30720
08:32:03.148962 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read
08:32:05.202903 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 41 104 read fh 0,0/-293204188 1024 bytes @ 30720
08:32:05.202939 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read
08:32:05.281127 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 42 104 read fh 0,0/-293204188 1024 bytes @ 31744
08:32:05.281154 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read
08:32:07.335172 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 42 104 read fh 0,0/-293204188 1024 bytes @ 31744
08:32:07.335203 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read
08:32:07.415459 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 43 104 read fh 0,0/-293204188 1024 bytes @ 32768
08:32:07.415487 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read
08:32:09.469354 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 43 104 read fh 0,0/-293204188 1024 bytes @ 32768
08:32:09.469393 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read

Working:
08:26:21.535864 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 46 reply ok 1124 read
08:26:21.615987 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 47 104 read fh 0,0/-293204188 1024 bytes @ 36864
08:26:21.616018 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 47 reply ok 1124 read
08:26:23.670014 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 47 104 read fh 0,0/-293204188 1024 bytes @ 36864
08:26:23.670104 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 47 reply ok 1124 read
08:26:23.748238 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 48 104 read fh 0,0/-293204188 1024 bytes @ 37888
08:26:23.748313 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 48 reply ok 1124 read
08:26:25.802286 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 48 104 read fh 0,0/-293204188 1024 bytes @ 37888
08:26:25.802369 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 48 reply ok 1124 read
08:26:25.880508 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 49 104 read fh 0,0/-293204188 1024 bytes @ 38912
08:26:25.880570 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 49 reply ok 1124 read

Any suggestions what to look for or how to debug? Is there something
that would need to change in the bootloader or some nfs flag?

Thanks

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Paul Goyette

2017-11-19 23:46:21 UTC

Permalink

Post by scole_mail
Hello,
I've got an old powermac that I boot diskless. But when I tried to put
it on a newer and faster switch, the kernel takes forever to nfs load
and finally boot. Once the kernel has loaded, the ethernet works fine.
If I leave the the old switch in, it works fine.
<snip>
Any suggestions what to look for or how to debug? Is there something
that would need to change in the bootloader or some nfs flag?

I had a similar issue some months ago, and it turned out that the "slow"
condition was caused by a switch that didn't handle flow-control very
well. I didn't get into the details, but the problem occurred when I
had both 100M and 1000M/1G devices plugged in.

Changing the switch fixed the problem.

(That's what I got for trying to use the least-expensive swith I could
fine. You gets what you pays for.)

+------------------+--------------------------+----------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Michael van Elst

2017-11-20 23:58:41 UTC

Permalink

***@gmx.com (scole_mail) writes:

Both seem to be slow, look at the time stamps and repeated requests (same xid).
Sure that one is a dump from a 'working' setup?

Post by scole_mail
08:32:03.148962 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read
08:32:05.202939 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read
08:32:05.281154 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read
08:32:07.335203 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read
08:32:07.415487 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read
08:32:09.469393 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read
08:26:21.535864 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 46 reply ok 1124 read
08:26:21.616018 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 47 reply ok 1124 read
08:26:23.670104 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 47 reply ok 1124 read
08:26:23.748313 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 48 reply ok 1124 read
08:26:25.802369 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 48 reply ok 1124 read
08:26:25.880570 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 49 reply ok 1124 read

--
--
Michael van Elst
Internet: ***@serpens.de
"A potential Snark may lurk in every tree."

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

scole_mail

2017-11-21 00:28:22 UTC

Permalink

Post by Michael van Elst
Both seem to be slow, look at the time stamps and repeated requests (same xid).
Sure that one is a dump from a 'working' setup?

Yes, that data looks incorrect.

He is another one that should actually be from the 'working' setup :

16:18:45.517755 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 40 104 read fh 0,0/-293204188 1024 bytes @ 29696
16:18:45.517779 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 40 reply ok 1124 read
16:18:45.519313 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 41 104 read fh 0,0/-293204188 1024 bytes @ 30720
16:18:45.519336 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read
16:18:45.520867 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 42 104 read fh 0,0/-293204188 1024 bytes @ 31744
16:18:45.520891 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read
16:18:45.525017 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 43 104 read fh 0,0/-293204188 1024 bytes @ 32768
16:18:45.525041 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read
16:18:45.526610 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 44 104 read fh 0,0/-293204188 1024 bytes @ 33792
16:18:45.526634 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 44 reply ok 1124 read
16:18:45.528167 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 45 104 read fh 0,0/-293204188 1024 bytes @ 34816
16:18:45.528191 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 45 reply ok 1124 read
16:18:45.529723 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 46 104 read fh 0,0/-293204188 1024 bytes @ 35840
16:18:45.529747 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 46 reply ok 1124 read
16:18:45.533818 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 47 104 read fh 0,0/-293204188 1024 bytes @ 36864
16:18:45.533841 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 47 reply ok 1124 read
16:18:45.535414 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 48 104 read fh 0,0/-293204188 1024 bytes @ 37888
16:18:45.535441 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 48 reply ok 1124 read
16:18:45.536972 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 49 104 read fh 0,0/-293204188 1024 bytes @ 38912
16:18:45.536996 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 49 reply ok 1124 read
16:18:45.538528 IP 10.0.0.3.1020 > 10.0.0.1.2049: NFS request xid 50 104 read fh 0,0/-293204188 1024 bytes @ 39936
16:18:45.538552 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 50 reply ok 1124 read

Thanks

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Lloyd Parkes

2017-11-21 00:16:53 UTC

Permalink

Post by Michael van Elst
Both seem to be slow, look at the time stamps and repeated requests (same xid).
Sure that one is a dump from a 'working' setup?

I agree with Michael that the dump from the “working” setup doesn’t look healthy at all.

I have seen similar problems relating to NFS and a lack of flow control before (as paul mentioned), but these read requests all fit inside a single ethernet frame, so you shouldn’t need any flow control. You are only ever waiting for one packet and you really shouldn’t need flow control.

One thing I have seen is horrifyingly slow network interfaces speaking 10base-T. I had to rate limit traffic to a SAN switch’s 10base-T management port because it simply couldn’t handle traffic coming in at 10Mb/s and firmware updates via FTP were timing out. Maybe the powermac is half-duplex (which is reasonable for 10base-T) and it can’t switch from transmit to receive fast enough to read the reply properly (which is me being super speculative). Try setting your NFS server’s network link to a slower speed, such as 10base-T just to see if that helps.

Cheers,
Lloyd

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Christos Zoulas

2017-11-21 17:45:02 UTC

Permalink

Post by Michael van Elst

Post by Michael van Elst
Both seem to be slow, look at the time stamps and repeated requests

(same xid).

Post by Michael van Elst
Sure that one is a dump from a 'working' setup?

I agree with Michael that the dump from the âworkingâ setup
doesnât look healthy at all.
I have seen similar problems relating to NFS and a lack of flow control
before (as paul mentioned), but these read requests all fit inside a
single ethernet frame, so you shouldnât need any flow control. You are
only ever waiting for one packet and you really shouldnât need flow
control.
One thing I have seen is horrifyingly slow network interfaces speaking
10base-T. I had to rate limit traffic to a SAN switchâs 10base-T
management port because it simply couldnât handle traffic coming in at
10Mb/s and firmware updates via FTP were timing out. Maybe the powermac
is half-duplex (which is reasonable for 10base-T) and it canât switch
from transmit to receive fast enough to read the reply properly (which
is me being super speculative). Try setting your NFS serverâs network
link to a slower speed, such as 10base-T just to see if that helps.

Try mounting with rsize=1024,wsize=1024?

christos

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

scole_mail

2017-11-21 22:10:46 UTC

Permalink

Post by Christos Zoulas

Post by Michael van Elst

Post by Michael van Elst
Both seem to be slow, look at the time stamps and repeated requests

(same xid).

Post by Michael van Elst
Sure that one is a dump from a 'working' setup?

I agree with Michael that the dump from the “working” setup
doesn’t look healthy at all.
I have seen similar problems relating to NFS and a lack of flow control
before (as paul mentioned), but these read requests all fit inside a
single ethernet frame, so you shouldn’t need any flow control. You are
only ever waiting for one packet and you really shouldn’t need flow
control.
One thing I have seen is horrifyingly slow network interfaces speaking
10base-T. I had to rate limit traffic to a SAN switch’s 10base-T
management port because it simply couldn’t handle traffic coming in at
10Mb/s and firmware updates via FTP were timing out. Maybe the powermac
is half-duplex (which is reasonable for 10base-T) and it can’t switch
from transmit to receive fast enough to read the reply properly (which
is me being super speculative). Try setting your NFS server’s network
link to a slower speed, such as 10base-T just to see if that helps.

Try mounting with rsize=1024,wsize=1024?
christos

I reposted a corrected "working" set. Unfortunately, these are all
unmanaged switches.

The rsize/wsize options didn't seem to make a difference either way.

Thanks for the suggestions so far.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Michael van Elst

2017-11-22 20:42:18 UTC

Permalink

Post by scole_mail
The rsize/wsize options didn't seem to make a difference either way.

Since kernel loading is the issue and not the kernel accessing the
root volume you have to look at the macppc bootloader. The bootloader
uses the NFS client in libsa which does UDP only and is already limited
to rsize=1024.

What you see in the bad trace is a NFS request that is quickly answered
by the NFS server, and then a pause of about two seconds. Then the same
NFS request is retried, answered as quickly but then followed by the next
request.

This means that the first response isn't seen by the client.

One possible reason for this is a duplex mismatch between client and switch.
The bootloader doesn't configure the interface but relies on OpenFirmware
to do this and handle autonegotiation correctly. Maybe there are properties
in OpenFirmware where you can configure the interface.

Another possibility is that the Gigabit switch doesn't buffer packets, then
it's possible that packets from 1Gbps to 10Mbps port are dropped when they
arrive too fast. For a test you could either try to throttle the NFS server
to 100Mbps or maybe the switch supports 802.3x style flow control and you
can enable this on the interface of the NFS server.

Lloyd Parkes

2017-11-22 18:05:21 UTC

Permalink

Post by scole_mail
Unfortunately, these are all
unmanaged switches

That is going to be a problem if the cause of the slow boots is rate related.

At this point I would recommend keeping the 10/100 switch, if only for your powermac. Apple’s OpenFirmware in the powermacs is notorious and the fact that you have a way get it to boot reliably means that you are already winning. If someone asks why you have a 10/100 switch as well as a 1Gb/s switch, you just need to point at the 10/100 switch and say “that’s the box that lets my powermac boot”.

I still have one or two 10Mb/s hubs for connecting to some of my own antiques.

Cheers,
Lloyd

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Lloyd Parkes

2017-11-21 18:04:07 UTC

Permalink

This mail got dinged by the mailing list HTML filter because Apple Mail switched the message format to HTML in the middle of the conversation.

Subject: Re: nfs kernel booting is slow
Date: 20 November 2017 at 2:50:15 PM GMT-10

Post by scole_mail
08:32:03.148962 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read

xid 41 failed, response time 27μs

Post by scole_mail
08:32:05.202939 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 41 reply ok 1124 read

xid 41 succeeded, response time 36μs

Post by scole_mail
08:32:05.281154 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read

xid 42 failed, response time 27μs

Post by scole_mail
08:32:07.335203 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 42 reply ok 1124 read

xid 42 succeeded, response time 31μs

Post by scole_mail
08:32:07.415487 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read

xid 43 failed, response time 28μs

Post by scole_mail
08:32:09.469393 IP 10.0.0.1.2049 > 10.0.0.3.1020: NFS reply xid 43 reply ok 1124 read

xid 43 succeeded?, response time 39μs
It’s a really small sample set, but the failures are all <= 28μs and the successes are all >= 31μs.
I’ve just seen your new working dump, which looks much better. Also, times in the working dump are shorter than the failed reply times in the not-working dump. The 100Mb/s link to the working switch will introduce a packet forwarding delay though, which might be helping.
Maybe it’s your new switch which doesn’t handle 10base-T very well? Does the powermac do 10base-T full-duplex? Maybe the switch port has a duplex mismatch with your powermac?
If it is a speed related issue, slowing down the NFS server’s network link should show this up.
Cheers,
Lloyd

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de