Discussion:
rbuf starvation in the iwn driver
Sverre Froyen
2010-04-05 14:38:41 UTC
Permalink
Hi,

I have noticed that the current iwn driver will sometimes lock up completely.
When this occurs, the error count (as reported by netstat -i) keeps
increasing and no packets are received.

Here is what appears(*) to happen (amd64 / current):

The driver is using rbufs to store received packets. It allocates one rbuf per
RX ring plus 32 extra. The extra buffers are used by iwn_rx_done as shown in
this code fragment:

	rbuf = iwn_alloc_rbuf(sc);
	/* Attach RX buffer to mbuf header. */
	MEXTADD(m1, rbuf->vaddr, IWN_RBUF_SIZE, 0, iwn_free_rbuf, rbuf);
	m1->m_flags |= M_EXT_RW;

If there are available rbufs, iwn_alloc_rbuf returns one rbuf and decrements
the number-of-free-rbufs counter. Otherwise, it returns null. iwn_free_rbuf
returns the rbuf to the free list and increments the free counter. It is
called automatically by the network stack.

Monitoring the number-of-free-rbufs counter during network traffic, I find that
it normally stays at 32, occasionally dropping into the twenties. Sometimes,
however, the count will abruptly jump to zero. At this point, the free count
does not recover but remains at zero for a *long* time. The interface does not
receive any packets as long as the driver has no free rbufs. After about ten
minutes, I see a flurry of calls to iwn_free_rbuf and the free count returns to
32. At this point the interface is working properly again.

What to do about this?

Can the mbuf code be modified not to hold on to the rbufs for as long as it
does? (I do not know whether the received data sitting in the rbufs has been
transferred to the userland code yet, but it seems likely that it has.)

Perhaps simply increase the number of extra rbuf buffers? Presumably, that
would make the problem happen less frequently. Perhaps increase it dynamically
by allocating additional rbufs when the free count drops to zero.

Implement an MCLGETI-like function, as done in OpenBSD, and drop the rbuf
implementation. I made a crude attempt at this with _MCLGET(m, mcl_cache,
size, how) but ended up with an early panic in another part of the kernel.

Look to the FreeBSD driver, which uses yet another solution.

Comments?

Thanks,
Sverre

(*) I said "appears to happen" because I debugged this issue using a more
recent port of the iwn OpenBSD driver than what is in current. But, as the
current driver exhibits the same lockup symptoms and the rbuf code is the
same, I have confidence in my analysis.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Sverre Froyen
2010-04-06 01:45:35 UTC
Permalink
Post by Sverre Froyen
Monitoring the number-of-free-rbufs counter during network traffic, I
find that it normally stays at 32, occasionally dropping into the
twenties. Sometimes, however, the count will abruptly jump to zero. At
this point, the free count does not recover but remains at zero for a
*long* time. The interface does not receive any packets as long as the
driver has no free rbufs. After about ten minutes, I see a flurry of
calls to iwn_free_rbuf and the free count returns to 32. At this point
the interface is working properly again.
During the flurry of calls to iwn_free_rbuf(), can you get a backtrace
in iwn_free_rbuf()? I hope that will show us what mechanism frees them
in a flurry.
Here is a trace from the first call to iwn_free_rbuf after the interface has
been locked up for ~10 mins.

iwn_free_rbuf
m_freem
tcp_freeq
tcp_close
tcp_timer_rexmt
callout_softclock
softint_dispatch
DDB lost frame for netbsd:Xsoftintr
Xsoftintr

This looks like some type of timeout. For what it's worth, I had quit the
applications that I was using to trigger the lock-up long before the call to
iwn_free_rbuf (although there were additional programs with network
connections open at the time of the call, ntpd, apache and openvpn come to
mind).

In the process of collecting the above trace, I added a call to panic if
iwn_free_rbuf was called with a free-buffer count of zero. It turns out this
happens rather quickly (long before the interface locks up). Here is the
trace from a non-locked-up call:

iwn_free_rbuf
soreceive
do_sys_recvmsg
sys_recvfrom
syscall

Sverre
This sounds similar to a bug that was fixed in OpenBSD ~3 years ago:
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/if_wpi.c?rev=1.51;content-type=text%2Fx-cvsweb-markup
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/if_wpi.c.diff?r1=1.50;r2=1.51;f=h
You should look at NetBSD's wpi(4) as it seems to have this issue
fixed too (using m_dup). I have no idea why it has not been
backported to NetBSD's iwn(4) though.
It looks like the NetBSD wpi changes he refers to must be these:

revision 1.10
date: 2007/06/18 19:40:49; author: degroote; state: Exp; lines: +37 -19
Add a workaround in the case where we have low number of rbuf.
It seems to fix problem of frozen network with wpi.

Looking through the if_iwn.c revisions it looks like the iwn driver had the
m_dup code until rev 1.33 (when iwn_rx_intr was replaced by iwn_rx_done). I'll
see if I can reintegrate the code.

PPS I used panic(9) to get the traces. Is it safe to "continue" from such a
diagnostic panic?

Sverre Froyen
2010-04-07 17:30:24 UTC
Permalink
On Tue April 6 2010 06:26:10 Manuel Bouyer wrote:
...
What this driver is doing looks somewhat wrong to me.
It can use its rbufs as external mbuf storage, and pass this to the network
stack, as long as there are enough free rbufs. When the number of free rbufs
is below the limit, it should stop giving them to the network stack,
and copy the received data to an ordinary mbuf cluster instead.
I've been looking at the iwn_rx_done code to see how I can fix this. One option
is to use the code from iwn_rx_intr in if_iwn.c rev 1.32 (I presume I can
figure out the needed bus_dmamap changes by looking at if_nfe.c).

However, it looks like the current OpenBSD and FreeBSD drivers do not use
private storage. FreeBSD just gets an mbuf cluster from m_getjcl and OpenBSD
uses MCLGETI. Is this an option for NetBSD? Why was private storage used in
the first place? For speed reasons? Damien Bergamini warned me there would
likely be alignment issues.

Thanks,
Sverre

Manuel Bouyer
2010-04-07 17:57:26 UTC
Permalink
Post by Sverre Froyen
I've been looking at the iwn_rx_done code to see how I can fix this. One option
is to use the code from iwn_rx_intr in if_iwn.c rev 1.32 (I presume I can
figure out the needed bus_dmamap changes by looking at if_nfe.c).
However, it looks like the current OpenBSD and FreeBSD drivers do not use
private storage. FreeBSD just gets an mbuf cluster from m_getjcl and OpenBSD
uses MCLGETI. Is this an option for NetBSD? Why was private storage used in
the first place? For speed reasons?
No idea.
Post by Sverre Froyen
Damien Bergamini warned me there would
likely be alignment issues.
If you use mbuf clusters, the storage will be aligned to 2k (the size of
an mbuf cluster's storage). I don't think it will be an issue.
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
Sverre Froyen
2010-04-07 22:38:37 UTC
Permalink
#define MEXTMALLOC(m, size, how) \
do { \
	(m)->m_ext_storage.ext_buf = \
	    (void *)malloc((size), mbtypes[(m)->m_type], (how)); \
	...
with a single malloc looks like it would be contiguous but probably slow.
It should also be properly aligned according to the malloc(9) man page.
Doesn't this memory need to be dma-safe?
It does. I assume it needs to be contiguous and properly aligned, which
malloc(9) seems to imply. Does it also need to be wired? I notice that the
malloc man page says it's deprecated and to use pool_cache(9) or kmem(9)
instead. kmem provides wired memory.

A quick test of using MEXTMALLOC in the driver appears to work just fine. In
fact, the throughput appears to have improved and the hard lockups are gone
(at least in my initial tests).

Regards,
Sverre

Thor Lancelot Simon
2010-04-07 23:18:35 UTC
Permalink
Post by Sverre Froyen
Doesn't this memory need to be dma-safe?
It does. I assume it needs to be contiguous and properly aligned, which
malloc(9) seems to imply. Does it also need to be wired? I notice that the
malloc man page says it's deprecated and to use pool_cache(9) or kmem(9)
instead. kmem provides wired memory.
If you want memory you can DMA to/from, I think you need to use
bus_dmamem_alloc().

Thor

Manuel Bouyer
2010-04-08 12:07:30 UTC
Permalink
Post by Thor Lancelot Simon
If you want memory you can DMA to/from, I think you need to use
bus_dmamem_alloc().
Yes, reading the man page for bus_dmamap_load etc. I see that is required. Is
this a requirement for NetBSD but not for OpenBSD? Look at this code fragment
it's not required in NetBSD if the proper bus_dma operations are used
on the memory, but it can improve performance.
bus_dmamem_alloc() returns DMA-safe memory, in the sense that it should
not require e.g. bounce buffers.
If you're doing DMA to arbitrary memory, the bus_dma call will check if
it's DMA-safe and, if it's not, will allocate bounce buffers and
copy data from/to them.
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
Sverre Froyen
2010-04-09 13:58:55 UTC
Permalink
Hi,

Attached is a minimal patch that fixes the iwn lock-ups (at least for me :-).
Please test.

Regards,
Sverre
matthew green
2010-04-10 02:49:48 UTC
Permalink
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.

thanks.


.mrg.

Index: if_iwn.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/if_iwn.c,v
retrieving revision 1.38
diff -p -r1.38 if_iwn.c
*** if_iwn.c 5 Apr 2010 07:20:26 -0000 1.38
--- if_iwn.c 10 Apr 2010 02:48:10 -0000
*************** iwn_rx_done(struct iwn_softc *sc, struct
*** 1939,1955 ****
ifp->if_ierrors++;
return;
}
! if (sc->rxq.nb_free_entries <= 0) {
! ic->ic_stats.is_rx_nobuf++;
! ifp->if_ierrors++;
! m_freem(m1);
! return;
}
! rbuf = iwn_alloc_rbuf(sc);
! /* Attach RX buffer to mbuf header. */
! MEXTADD(m1, rbuf->vaddr, IWN_RBUF_SIZE, 0, iwn_free_rbuf,
! rbuf);
! m1->m_flags |= M_EXT_RW;
bus_dmamap_unload(sc->sc_dmat, data->map);

error = bus_dmamap_load(sc->sc_dmat, data->map, m1->m_ext.ext_buf,
--- 1939,1959 ----
ifp->if_ierrors++;
return;
}
! if ((rbuf = iwn_alloc_rbuf(sc)) == NULL) {
! MEXTMALLOC(m1, IWN_RBUF_SIZE, M_DONTWAIT);
! if (!(m1->m_flags & M_EXT)) {
! ic->ic_stats.is_rx_nobuf++;
! ifp->if_ierrors++;
! m_freem(m1);
! return;
! }
! } else {
! /* Attach RX buffer to mbuf header. */
! MEXTADD(m1, rbuf->vaddr, IWN_RBUF_SIZE, 0, iwn_free_rbuf,
! rbuf);
! m1->m_flags |= M_EXT_RW;
}
!
bus_dmamap_unload(sc->sc_dmat, data->map);

error = bus_dmamap_load(sc->sc_dmat, data->map, m1->m_ext.ext_buf,

Jeremy C. Reed
2010-04-10 03:23:54 UTC
Permalink
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
*** if_iwn.c 5 Apr 2010 07:20:26 -0000 1.38
--- if_iwn.c 10 Apr 2010 02:48:10 -0000
That also appears to work for me. I did many file transfers and it
seemed to work fine, but I have only been using it for 13 minutes. I
used Sverre's patch for several hours successfully.

matthew green
2010-04-10 04:21:27 UTC
Permalink
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
*** if_iwn.c 5 Apr 2010 07:20:26 -0000 1.38
--- if_iwn.c 10 Apr 2010 02:48:10 -0000
That also appears to work for me. I did many file transfers and it
seemed to work fine, but I have only been using it for 13 minutes. I
used Sverre's patch for several hours successfully.

thanks for reporting.

my wireless is usually pretty flaky at home anyway, but it seems to
be working much better with this patch... i was expecting to have
switched back to the wire by now, but it's been hours without a
single dropout and a bunch of transfers.

sverre: thanks for the initial patch!


.mrg.

David Young
2010-04-11 18:40:45 UTC
Permalink
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
Is there any reason not to replace the implementation of
iwn_alloc_rbuf() with MEXTMALLOC(), instead?

Previously, a remote host could exhaust an iwn(4)'s private Rx buffer
pool by, say, sending a ton of tiny segments to TCP sockets that already
have a long backlog. Can a remote host now deplete the malloc(9) pool
in the same way?

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933

matthew green
2010-04-12 06:31:48 UTC
Permalink
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
Is there any reason not to replace the implementation of
iwn_alloc_rbuf() with MEXTMALLOC(), instead?

as in hide this inside iwn_alloc_rbuf()? probably. or do you
mean give up the private buffer collection? in that case, i
am not sure what the right solution would be, seems like a
concept we want to decide for (all?) network drivers...

Previously, a remote host could exhaust an iwn(4)'s private Rx buffer
pool by, say, sending a ton of tiny segments to TCP sockets that already
have a long backlog. Can a remote host now deplete the malloc(9) pool
in the same way?

i don't know. however, i'd trade that potential DoS for the
actual much better working driver now... :)

still, this is something worth investigating (but not enough
for me to look at any time soon.)


.mrg.

Sverre Froyen
2010-04-12 15:57:46 UTC
Permalink
Post by David Young
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
Is there any reason not to replace the implementation of
iwn_alloc_rbuf() with MEXTMALLOC(), instead?
as in hide this inside iwn_alloc_rbuf()? probably. or do you
mean give up the private buffer collection? in that case, i
am not sure what the right solution would be, seems like a
concept we want to decide for (all?) network drivers...
I suggest we deviate as little as possible from the original driver (OpenBSD's
iwn driver in this case). OpenBSD uses a single MCLGETI call and no private
storage. If necessary we use an adapter function to graft on the rbuf
implementation.

Regards,
Sverre

Jeremy C. Reed
2010-04-09 16:19:45 UTC
Permalink
Post by Sverre Froyen
Attached is a minimal patch that fixes the iwn lock-ups (at least for
me :-). Please test.
Appears to work for me. Before, my backups would hang my networking
temporarily. But now my networking appears to work and my backups
complete! Thank you!

Stephen Borrill
2010-04-09 16:52:36 UTC
Permalink
Post by Sverre Froyen
Attached is a minimal patch that fixes the iwn lock-ups (at least for me :-).
Slightly off-topic:

Was there any progress backporting the 5100/5300 support to netbsd-5?
--
Stephen


matthew green
2010-04-11 02:24:56 UTC
Permalink
Post by matthew green
would you try this variant? it handles the case where MEXTMALLOC()
would fail to allocate external storage, and seems to work on my
iwn(4). i'll commit this if you confirm it works.
I can confirm that the patch works. Thanks for fixing my broken error path!


committed! thanks for the original patch. i only fixed a
problem not seen in action :)


.mrg.
