Discussion: network locking strategy questions
Beverly Schwartz
2013-03-12 01:58:42 UTC
I have been trying to unravel the locking strategy for the input
and output paths. The input path looks clear - it uses
softnet_lock for all processing. The output path does not
seem to have a consistent locking strategy, and I am wondering
if there is some data contention along the way.

softnet_lock appears to protect all types of pcb's; I'm not sure
what other data is included in the softnet_lock strategy.
KERNEL_LOCK appears to protect the output queues and the movement
of data to and from non-MP_SAFE driver queues.
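
To check that I'm reading that division of labor correctly, here is
a minimal sketch of the split as I picture it; everything named
example_* is hypothetical, not an actual kernel symbol, and the usual
kernel headers are assumed:

/*
 * Hypothetical sketch: softnet_lock covers pcb/protocol state,
 * KERNEL_LOCK covers the hand-off to a non-MP_SAFE driver.
 */
static void
example_deliver(struct mbuf *m)
{

	KASSERT(mutex_owned(softnet_lock));
	/* pcb lookup, socket-buffer append, etc. happen here */
}

static int
example_transmit(struct ifnet *ifp, struct mbuf *m,
    const struct sockaddr *dst, struct rtentry *rt)
{
	int error;

	KERNEL_LOCK(1, NULL);	/* driver queues are not MP-safe */
	error = (*ifp->if_output)(ifp, m, dst, rt);
	KERNEL_UNLOCK_ONE(NULL);
	return error;
}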

I am asking, because with FAST_IPSEC and multi-core enabled, there
is a not-so-slow leak of MCL clusters. Running single-core, there
is no leak. So I am wondering if there is something in the locking
strategy that's coming up short. (The leak occurs when there is
an ESP tunnel in place, and data is going through the tunnel.
Tunnel establishment does not seem to be the cause, only moving
packets.) Note that we are also using pf and altq. If there's
no active IPsec ESP tunnel, there is no leak. Also note that the
leak does not happen if the kernel is compiled with KAME_IPSEC
rather than FAST_IPSEC.



Input path:
Every packet input path looks something like this:

mutex_enter(softnet_lock);
KERNEL_LOCK(1, NULL);
/* take packets off of the input queue */
KERNEL_UNLOCK_ONE(NULL);
/* process the list of packets taken from the input queue */
mutex_exit(softnet_lock);
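
Fleshed out a bit (simplified sketch only; example_intrq and
example_proto_input are placeholders, not real kernel symbols), the
pattern reads roughly like this:

static struct ifqueue example_intrq;		/* hypothetical input queue */
static void example_proto_input(struct mbuf *);	/* hypothetical handler */

static void
example_protointr(void *arg)
{
	struct ifqueue work;
	struct mbuf *m;

	memset(&work, 0, sizeof(work));

	mutex_enter(softnet_lock);

	/* The input queue itself is still guarded by KERNEL_LOCK. */
	KERNEL_LOCK(1, NULL);
	for (;;) {
		IF_DEQUEUE(&example_intrq, m);
		if (m == NULL)
			break;
		IF_ENQUEUE(&work, m);
	}
	KERNEL_UNLOCK_ONE(NULL);

	/* Per-packet processing runs with only softnet_lock held. */
	for (;;) {
		IF_DEQUEUE(&work, m);
		if (m == NULL)
			break;
		example_proto_input(m);
	}

	mutex_exit(softnet_lock);
}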



Output path:
If a packet is being forwarded, then softnet_lock is already held by
the input path processing. However, if the host is the source of the
packet, softnet_lock is not taken. Since we're not looking up pcb's,
this is probably not a problem. In IPv4, KERNEL_LOCK is taken before
the packet is queued. In IPv6, I cannot find where KERNEL_LOCK is
taken; that doesn't mean it doesn't exist, only that I can't find it.

v4 code:
KERNEL_LOCK(1, NULL);
error = (*ifp->if_output)(ifp, m,
    (m->m_flags & M_MCAST) ? sintocsa(rdst) : sintocsa(dst), rt);
KERNEL_UNLOCK_ONE(NULL);

v6 code:
return (*ifp->if_output)(ifp, m, sin6tocsa(dst), rt);

Given that altq DEPENDS on having KERNEL_LOCK, how can altq work
with v6 at all? It does work, but I can't explain it. Can anyone
out there?

(For grins, I surrounded the if_output call in the v6 code
with KERNEL_LOCK and KERNEL_UNLOCK_ONE. I would get the occasional
system crash and other improper behavior. Clearly, this code was
NOT meant to be surrounded by KERNEL_LOCK and KERNEL_UNLOCK_ONE.)
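
Concretely, the change I experimented with was just this (a sketch of
the modified v6 call site, with an error variable added so the result
can still be returned):

int error;

KERNEL_LOCK(1, NULL);
error = (*ifp->if_output)(ifp, m, sin6tocsa(dst), rt);
KERNEL_UNLOCK_ONE(NULL);
return error;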



Timeouts:
Every timeout looks something like this:

mutex_enter(softnet_lock);
/* do processing required by the timeout */
mutex_exit(softnet_lock);
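
As a concrete shape (illustrative sketch only; example_slowtimo and
its callout are hypothetical, not existing kernel symbols, and the
callout initialization is elided):

static callout_t example_slowtimo_ch;	/* hypothetical */

static void
example_slowtimo(void *arg)
{

	mutex_enter(softnet_lock);
	/* walk protocol state, expire entries, update timers ... */
	mutex_exit(softnet_lock);

	/* re-arm, e.g. twice a second */
	callout_schedule(&example_slowtimo_ch, hz / 2);
}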



FAST_IPSEC:
AH, ESP, and ipcomp all have callbacks when crypto functions have completed
their work. They each have two callbacks, one for packets coming in, and
one for packets going out. These callbacks take softnet_lock, do their
processing, and release softnet_lock.

For incoming packets, this is consistent with the locking strategy
for the input path. For IPv4, (*inetsw[ip_protox[prot]].pr_input) is
called. For IPv6, (*inet6sw[ip6_protox[nxt]].pr_input) is called.
Presumably, these calls will end up calling ip_input and ip6_input,
at least for tunnel mode, because after removing headers we end up
with a new IPv4 or IPv6 packet to process.
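
As a sketch of that shape (simplified from my reading of the callback;
m, skip, and prot stand in for whatever the real callback carries
around):

mutex_enter(softnet_lock);
(*inetsw[ip_protox[prot]].pr_input)(m, skip, prot);
mutex_exit(softnet_lock);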

The output callbacks call ipsec_process_done, which calls
ipsec_reinject_ipstack, which in turn calls ip_output or ip6_output.
KERNEL_LOCK and KERNEL_UNLOCK_ONE surround the ip_output and
ip6_output calls.
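
A simplified sketch of that reinjection step as I read it (the
options, routes, and flags are elided and the argument lists are from
memory, so this shows the shape rather than being a verbatim copy of
the kernel code):

static int
example_reinject(struct mbuf *m, int af)
{
	int error;

	KERNEL_LOCK(1, NULL);
	switch (af) {
	case AF_INET:
		error = ip_output(m, NULL, NULL, 0, NULL, NULL);
		break;
#ifdef INET6
	case AF_INET6:
		error = ip6_output(m, NULL, NULL, 0, NULL, NULL, NULL);
		break;
#endif
	default:
		m_freem(m);
		error = EAFNOSUPPORT;
		break;
	}
	KERNEL_UNLOCK_ONE(NULL);

	return error;
}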


PF:
When pf is called on the input path, softnet_lock is held.
When pf is called on the output path, I can't see any lock being
taken, although if it's a forwarded packet, softnet_lock is already
held from the input path. And when pf is configured,
pf_consistency_lock is held, but no lock is held when pf data
structures are accessed from the data plane.


So does anyone understand how all this works together?

Thanks.

-Bev


David Laight
2013-03-12 08:09:04 UTC
Post by Beverly Schwartz
I am asking, because with FAST_IPSEC and multi-core enabled, there
is a not-so-slow leak of MCL clusters.
Sometimes it can be helpful to identify the contents of the
leaked items.
Finding one can be tricky, but if the leak is bad enough any random
piece of memory will eventually be filled with the leaking item.

David
--
David Laight: ***@l8s.co.uk

Beverly Schwartz
2013-03-12 13:37:38 UTC
Post by David Laight
Post by Beverly Schwartz
I am asking, because with FAST_IPSEC and multi-core enabled, there
is a not-so-slow leak of MCL clusters.
Sometimes it can be helpful to identify the contents of the
leaked items.
Finding one can be tricky, but if the leak is bad enough any random
piece of memory will eventually be filled with the leaking item.
I am leaking from mclpl. mclpl has a limit, so once mclpl hits its
limit, functions which use mclpl stop working, but everything else
still runs. Once in that state, using vmstat -m, one can observe
the rising failure rate for mclpl requests.

Since mclpl has its own API, it is easy to find every place
that uses mclpl. However, what's not so easy is finding where
members of mclpl are not being released as expected. I keep
coming back to the locking strategy: some sort of conflict seems
to be preventing mclpl entries from being freed when they
normally would be. Running single-core, this does not occur at
all. Using KAME_IPSEC, it doesn't happen at all either. In the
next tests I run, I will turn off altq and pf to see whether
they are a factor. altq and pf do not cause a problem if there
is *no* active tunnel.

-Bev
