Discussion:
altq and multi-core processing
Beverly Schwartz
2013-03-01 19:57:39 UTC
We've observed a series of unusual problems when we have enabled
multi-core with two processors. We are using altq and pf.

Perusing the altq code, I found this comment in altq_rmclass.c before
the function rmc_restart:

/*
* void
* rmc_restart() - is just a helper routine for rmc_delay_action -- it is
* called by the system timer code & is responsible checking if the
* class is still sleeping (it might have been restarted as a side
* effect of the queue scan on a packet arrival) and, if so, restarting
* output for the class. Inspecting the class state & restarting output
* require locking the class structure. In general the driver is
* responsible for locking but this is the only routine that is not
* called directly or indirectly from the interface driver so it has
* know about system locking conventions. Under bsd, locking is done
* by raising IPL to splnet so that's what's implemented here. On a
* different system this would probably need to be changed.
*
* Returns: NONE
*/

Note the last two sentences:
* Under bsd, locking is done
* by raising IPL to splnet so that's what's implemented here. On a
* different system this would probably need to be changed.

Looking through the altq code, there appears to be *no* locking done.
The only synchronization that appears to exist is calls to splnet.
While this will work on a single-processor system, I suspect some
of our unusual problems might be caused by contention in altq. I
haven't dug deep enough to say more than this.
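
To illustrate the concern, here is a minimal user-space sketch (not kernel code, and not taken from altq). It assumes, as the quoted comment describes, that the only protection is raising the interrupt priority level, and that the level is per-CPU state. Every name in it (model_cpu, model_splnet, model_splx, spl_only_overlap) is hypothetical and exists only for this illustration. The model is deliberately sequential, so it shows the logical gap rather than an actual race.

```c
#include <stdbool.h>

#define NCPU     2
#define IPL_NONE 0
#define IPL_NET  1

/* Hypothetical model: each CPU has its own interrupt priority level. */
struct model_cpu {
    int  ipl;          /* current priority level of this CPU only */
    bool in_critical;  /* true while this CPU is inside the "protected" code */
};

static struct model_cpu cpus[NCPU];

/* Model of splnet(): raises the level on the calling CPU, returns the old one. */
static int model_splnet(int cpu)
{
    int old = cpus[cpu].ipl;
    if (old < IPL_NET)
        cpus[cpu].ipl = IPL_NET;
    return old;
}

/* Model of splx(): restores the saved level on the calling CPU. */
static void model_splx(int cpu, int s)
{
    cpus[cpu].ipl = s;
}

/* Count how many CPUs are inside the "protected" region right now. */
static int cpus_in_critical(void)
{
    int n = 0;
    for (int i = 0; i < NCPU; i++)
        if (cpus[i].in_critical)
            n++;
    return n;
}

/*
 * Both CPUs raise their own level and enter the region. Nothing in
 * CPU 0's level stops CPU 1, so the function reports how many CPUs
 * were inside at the same moment.
 */
int spl_only_overlap(void)
{
    int s0 = model_splnet(0);
    cpus[0].in_critical = true;

    int s1 = model_splnet(1);      /* succeeds: it only touches CPU 1 */
    cpus[1].in_critical = true;

    int overlap = cpus_in_critical();

    cpus[1].in_critical = false;
    model_splx(1, s1);
    cpus[0].in_critical = false;
    model_splx(0, s0);
    return overlap;
}
```

In this model spl_only_overlap() reports two CPUs inside the region at once, which is the situation a real lock would have to prevent. Whether the actual kernel behaves this way depends on what else serializes the callers.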

Anybody have any experience with this?

-Bev

---
Beverly Schwartz
BBN Technologies
***@bbn.com

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Michael van Elst
2013-03-02 10:25:18 UTC
Post by Beverly Schwartz
> We've observed a series of unusual problems when we have enabled
> multi-core with two processors. We are using altq and pf.
>
> Looking through the altq code, there appears to be *no* locking done.
> The only synchronization that appears to exist is calls to splnet.

altq is supposed to run under the giant lock, where splXXX still works
even with multiple processors.
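
As a rough illustration of that point, the following user-space sketch models a single kernel-wide lock that every CPU must hold before touching the queueing state. It is a sequential toy model under stated assumptions, not the kernel's implementation: giant_tryenter, giant_exit, model_enqueue, giant_lock_scenario and queue_len are all hypothetical names invented for this example.

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of one kernel-wide ("giant") lock. */
static int giant_owner = NO_OWNER;

/* Stand-in for shared queueing state that must not be updated concurrently. */
int queue_len = 0;

/* Try to take the giant lock for `cpu`; fails while any CPU holds it. */
static bool giant_tryenter(int cpu)
{
    if (giant_owner != NO_OWNER)
        return false;
    giant_owner = cpu;
    return true;
}

/* Release the giant lock if `cpu` is the holder. */
static void giant_exit(int cpu)
{
    if (giant_owner == cpu)
        giant_owner = NO_OWNER;
}

/* Enqueue one packet on behalf of `cpu`; false means "would have to wait". */
static bool model_enqueue(int cpu)
{
    if (!giant_tryenter(cpu))
        return false;
    queue_len++;
    giant_exit(cpu);
    return true;
}

/*
 * CPU 0 enters the queueing code and holds the giant lock. CPU 1 tries
 * to enqueue in the meantime and is kept out, then succeeds once CPU 0
 * has left. Returns how many attempts were refused.
 */
int giant_lock_scenario(void)
{
    int refused = 0;

    giant_owner = NO_OWNER;
    queue_len = 0;

    giant_tryenter(0);             /* CPU 0 is inside the queueing code */
    if (!model_enqueue(1))         /* CPU 1 cannot get in */
        refused++;
    queue_len++;                   /* CPU 0 updates the state alone */
    giant_exit(0);

    if (!model_enqueue(1))         /* lock is free now, CPU 1 proceeds */
        refused++;

    return refused;
}
```

In this model only one CPU at a time can reach the shared state, so per-CPU priority changes inside the locked region only need to guard against interrupts on the CPU that holds the lock.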

