Sverre Froyen
2010-04-05 14:38:41 UTC
Hi,
I have noticed that the current iwn driver sometimes will lock up completely.
When this occurs, the error count (as reported by netstat -i) keeps
increasing and no packets are received.
Here is what appears(*) to happen (amd64 / current):
The driver is using rbufs to store received packets. It allocates one rbuf per
RX ring entry plus 32 extra. The extra buffers are used by iwn_rx_done as shown in
this code fragment:
    rbuf = iwn_alloc_rbuf(sc);
    /* Attach RX buffer to mbuf header. */
    MEXTADD(m1, rbuf->vaddr, IWN_RBUF_SIZE, 0, iwn_free_rbuf,
        rbuf);
    m1->m_flags |= M_EXT_RW;
If there are available rbufs, iwn_alloc_rbuf returns one rbuf and decrements
the number-of-free-rbufs counter. Otherwise, it returns NULL. iwn_free_rbuf
returns the rbuf to the free list and increments the free counter. It is
called automatically by the network stack.
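
To make the bookkeeping concrete, here is a minimal userland sketch of a fixed-size pool with a free list and a free counter. It is not the driver code: the names rbuf_pool, rbuf_alloc, rbuf_free, RBUF_POOL_SIZE and RBUF_SIZE are made up for illustration, and there is no locking or DMA handling. It only shows why the allocator starts returning NULL once every buffer is held elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#define RBUF_POOL_SIZE 32   /* hypothetical: stands in for the 32 extra rbufs */
#define RBUF_SIZE      4096 /* hypothetical buffer size */

struct rbuf {
	struct rbuf *next;              /* link in the free list */
	unsigned char data[RBUF_SIZE];  /* packet storage */
};

struct rbuf_pool {
	struct rbuf bufs[RBUF_POOL_SIZE];
	struct rbuf *freelist;          /* head of the free list */
	int nfree;                      /* number-of-free-rbufs counter */
};

/* Put every buffer on the free list. */
static void
rbuf_pool_init(struct rbuf_pool *p)
{
	p->freelist = NULL;
	p->nfree = 0;
	for (int i = 0; i < RBUF_POOL_SIZE; i++) {
		p->bufs[i].next = p->freelist;
		p->freelist = &p->bufs[i];
		p->nfree++;
	}
}

/* Take one buffer off the free list, or return NULL if none are left. */
static struct rbuf *
rbuf_alloc(struct rbuf_pool *p)
{
	struct rbuf *r = p->freelist;

	if (r == NULL)
		return NULL;
	p->freelist = r->next;
	p->nfree--;
	return r;
}

/* Return a buffer to the free list; the holder decides when this runs. */
static void
rbuf_free(struct rbuf_pool *p, struct rbuf *r)
{
	assert(p->nfree < RBUF_POOL_SIZE);
	r->next = p->freelist;
	p->freelist = r;
	p->nfree++;
}
```

The point of the sketch is that the pool has no say in when rbuf_free runs. If whoever holds the buffers sits on all of them, rbuf_alloc fails until they come back.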
Monitoring the number-of-free-rbufs counter during network traffic, I find that
it normally stays at 32, occasionally dropping into the twenties. Sometimes,
however, the count will abruptly drop to zero. At this point, the free count
does not recover but remains at zero for a *long* time. The interface does not
receive any packets as long as the driver has no free rbufs. After about ten
minutes, I see a flurry of calls to iwn_free_rbuf and the free count returns to
32. At this point the interface is working properly again.
What to do about this?
Can the mbufs code be modified not to hold on to the rbufs for as long as it
does? (I do not know whether the received data sitting in the rbufs has
already been transferred to the userland code, but it seems likely that it
has.)
Perhaps simply increase the number of extra rbuf buffers? Presumably, that
would make the problem happen less frequently. Perhaps increase it dynamically
by allocating additional rbufs when the free count drops to zero.
Implement an MCLGETI-like function, as done in OpenBSD, and drop the rbufs
implementation. I made a crude attempt at this with _MCLGET(m, mcl_cache,
size, how) but ended up with an early panic in another part of the kernel.
Look to the FreeBSD driver, which uses yet another solution.
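
For the "increase it dynamically" option, this is roughly what I have in mind, again as a userland sketch only. The names and the limits RBUF_GROW and RBUF_MAX are invented for the example, and it uses plain malloc. A real driver would presumably need DMA-safe memory and could not sleep in the receive path, so this shows the policy, not a patch.

```c
#include <stddef.h>
#include <stdlib.h>

#define RBUF_SIZE 4096 /* hypothetical buffer size */
#define RBUF_GROW 8    /* hypothetical: buffers added per growth step */
#define RBUF_MAX  64   /* hypothetical: hard cap on the pool size */

struct rbuf {
	struct rbuf *next;
	unsigned char data[RBUF_SIZE];
};

struct rbuf_pool {
	struct rbuf *freelist;
	int nfree;   /* buffers currently on the free list */
	int ntotal;  /* buffers ever allocated for this pool */
};

/* Add up to n buffers without exceeding RBUF_MAX; return how many were added. */
static int
rbuf_pool_grow(struct rbuf_pool *p, int n)
{
	int added = 0;

	while (added < n && p->ntotal < RBUF_MAX) {
		struct rbuf *r = malloc(sizeof(*r));

		if (r == NULL)
			break;
		r->next = p->freelist;
		p->freelist = r;
		p->nfree++;
		p->ntotal++;
		added++;
	}
	return added;
}

/* Allocate a buffer, growing the pool first if the free list is empty. */
static struct rbuf *
rbuf_alloc(struct rbuf_pool *p)
{
	struct rbuf *r;

	if (p->freelist == NULL && rbuf_pool_grow(p, RBUF_GROW) == 0)
		return NULL; /* cap reached or out of memory */
	r = p->freelist;
	p->freelist = r->next;
	p->nfree--;
	return r;
}

/* Return a buffer to the free list. */
static void
rbuf_free(struct rbuf_pool *p, struct rbuf *r)
{
	r->next = p->freelist;
	p->freelist = r;
	p->nfree++;
}
```

The cap matters: without it, a stack that never returns buffers would make the pool grow without bound, so this only postpones the stall rather than curing it.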
Comments?
Thanks,
Sverre
(*) I said "appears to happen" because I debugged this issue using a more
recent port of the iwn OpenBSD driver than what is in current. But, as the
current driver exhibits the same lockup symptoms and the rbuf code is the same,
I have confidence in my analysis.
--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de