Beverly Schwartz
2012-04-19 23:39:33 UTC
In bnx_rx_intr, there is a while loop:
while (sw_cons != hw_cons)
Inside this loop, we grab the next mbuf that's available.
m = sc->rx_mbuf_ptr[sw_chain_cons];
sc->rx_mbuf_ptr[sw_chain_cons] = NULL;
It then goes on and tries to get an mbuf cluster to replace the one we just took off the ring.
if (bnx_get_buf(sc, &sw_prod, &sw_chain_prod, &sw_prod_bseq))
If bnx_get_buf fails, the code then calls bnx_add_buf to put the mbuf we just received back on the ring.
bnx_add_buf(sc, m, &sw_prod, &sw_chain_prod, &sw_prod_bseq)
Inside bnx_add_buf, first we put the mbuf (the recycled m) at sw_chain_prod. Because this is the same as sw_chain_cons, m gets placed back at the point we just nulled out.
sc->rx_mbuf_ptr[*chain_prod] = m_new;
Then sw_chain_prod gets bumped up.
*prod = NEXT_RX_BD(*prod);
*chain_prod = RX_CHAIN_IDX(*prod);
When the code returns from bnx_add_buf, a "continue" is executed, thus going around the loop again.
sw_chain_cons has NOT been incremented, since the call to NEXT_RX_BD that advances it comes further down in the loop body.
However, sw_chain_prod has been advanced in bnx_add_buf.
Next time around the loop, we do all of the above, but now sw_chain_prod is one greater than sw_chain_cons. Because we know we're out of mbuf clusters, bnx_get_buf will fail again, and we will recycle the mbuf once again. However, this time it will be placed one slot ahead of sw_chain_cons. Now we have lost an mbuf cluster forever (because there was already one at sc->rx_mbuf_ptr[*chain_prod], and it gets overwritten), and things go downhill from there. Eventually we lose all of our mbuf clusters, and our interfaces no longer function at all.
Note that there is another condition under which we recycle mbufs, and it suffers from the same problem.
I can think of four things we can do, but I'm not sure which is the "right" answer.
1. When we return from bnx_add_buf, restore sw_chain_prod to whatever it was before we called bnx_add_buf. This would likely cause an infinite loop, so it probably isn't a great solution.
2. When we return from bnx_add_buf, push sw_cons along. This will cause all of the packets that the bnx driver has sucked in to be dropped.
3. Instead of calling continue, call break. This will leave the receive chain intact. It breaks out of the loop, but bnx_rx_intr will probably be called trying to process the same packet over and over.
4. Instead of calling continue, increment sw_cons and break. This will cause one packet to be dropped, but it will at least change the conditions for the driver.
I shall try all of these options and see what happens...
-Bev
--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de