Beverly Schwartz
2013-04-16 22:55:46 UTC
I have created a facility in the kernel for tracking mbuf clusters. (Twice at BBN, we have successfully used this cluster tracking code to find mbuf cluster leaks.)
This facility can be compiled into the kernel by enabling the MCL_DEBUG option.
If MCL_DEBUG is enabled, then tracking data is kept for each mbuf cluster. Examples of data kept (a sketch of a possible record layout follows this list):
- When and where in the code the cluster was allocated.
- When and where in the code the cluster was freed.
- When and where the cluster was queued or dequeued.
- When and where the cluster was passed from one protocol to another.
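To make the list concrete, here is a minimal sketch of what a per-cluster tracking record could look like. The struct name, field names, and event encoding are illustrative assumptions, not the actual MCL_DEBUG data structures:

#include <sys/types.h>
#include <sys/time.h>
#include <stdbool.h>

/*
 * Hypothetical per-cluster tracking record; the real MCL_DEBUG
 * structures may well differ.
 */
struct mcl_track {
    int mt_event;            /* alloc, free, queue, dequeue, handoff */
    const char *mt_file;     /* source file where the event occurred */
    int mt_line;             /* source line where the event occurred */
    struct timeval mt_when;  /* time of the event */
    u_int mt_cpu;            /* CPU the event ran on */
    lwpid_t mt_lwp;          /* LWP id at the time of the event */
    bool mt_kernel_lock;     /* was KERNEL_LOCK held? */
    bool mt_softnet_lock;    /* was the softnet lock held? */
    int mt_anomaly;          /* nonzero if something looked wrong */
};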
At each of these points, I note which CPU we're on, the LWP id, whether or not we hold KERNEL_LOCK and/or the softnet lock, and whether anything anomalous was detected. Anomalies detected: a cluster allocated twice without being freed in between, a cluster freed without being allocated, a cluster unallocated when it was expected to be allocated, and a lock not held when it was expected to be held.
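As an illustration of how one of those checks could work, here is a minimal sketch of a tracking hook that catches a double allocation; mcl_track_alloc() and the state constants are hypothetical names, not the facility's actual interface:

#include <stdio.h>

/* Hypothetical per-cluster state kept alongside the tracking data. */
enum mcl_state { MCL_S_FREE, MCL_S_ALLOCATED };

struct mcl_entry {
    enum mcl_state me_state; /* current state of this cluster */
    int me_anomaly;          /* anomalies seen so far */
};

/* Record an allocation event, flagging a double allocation. */
static void
mcl_track_alloc(struct mcl_entry *me, const char *file, int line)
{
    if (me->me_state == MCL_S_ALLOCATED) {
        /* Allocated twice without being freed in between. */
        me->me_anomaly++;
        printf("mcl: double alloc at %s:%d\n", file, line);
    }
    me->me_state = MCL_S_ALLOCATED;
}

The free path would perform the mirror-image check (freeing a cluster that was never allocated), and the lock checks would consult the current lock state at the same point.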
I have set up code in /proc for accessing the data, but it would be nice to have a userspace program for examining it. Using kvm, the data can be inspected in a live kernel or in a core dump.
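For example, assuming a hypothetical node name (the path the facility actually exposes may differ), the raw tracking data could be dumped with a trivial reader:

#include <stdio.h>

int
main(void)
{
    /* "/proc/mclusters" is a made-up name for the tracking node. */
    FILE *fp = fopen("/proc/mclusters", "r");
    char line[256];

    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof(line), fp) != NULL)
        fputs(line, stdout); /* pass records through unchanged */
    fclose(fp);
    return 0;
}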
Keep in mind, there can be up to 8192 clusters, so there is potentially a *lot* of data. Using kvm, we can also follow pointers in the data to inspect the contents of mbufs and mbuf clusters. I expect that this, too, could be quite useful.
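On the kvm side, a minimal sketch of a reader might look like the following. The kernel symbol _mcl_track_table is an assumed name for the tracking table; only the kvm_openfiles()/kvm_nlist()/kvm_read() calls are the standard kvm(3) interface:

#include <sys/types.h>
#include <err.h>
#include <fcntl.h>
#include <kvm.h>
#include <limits.h>
#include <nlist.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
    char errbuf[_POSIX2_LINE_MAX];
    kvm_t *kd;
    void *table;
    /* "_mcl_track_table" is a hypothetical kernel symbol name. */
    struct nlist nl[] = {
        { .n_name = "_mcl_track_table" },
        { .n_name = NULL },
    };

    /* NULL arguments mean the running kernel; pass a core file
     * as the second argument for post-mortem inspection. */
    kd = kvm_openfiles(NULL, argc > 1 ? argv[1] : NULL, NULL,
        O_RDONLY, errbuf);
    if (kd == NULL)
        errx(1, "kvm_openfiles: %s", errbuf);

    if (kvm_nlist(kd, nl) != 0 || nl[0].n_value == 0)
        errx(1, "_mcl_track_table not found");

    /* Read the table pointer; from here each record's pointers
     * (mbufs, clusters) can be chased with further kvm_read() calls. */
    if (kvm_read(kd, nl[0].n_value, &table, sizeof(table)) !=
        (ssize_t)sizeof(table))
        errx(1, "kvm_read: %s", kvm_geterr(kd));

    printf("mcl track table at %p\n", table);
    kvm_close(kd);
    return 0;
}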
Options I have considered:
- a new usr.bin program
- adding new options to vmstat for this data
- adding new options to netstat for this data
Any thoughts or preferences?
-Bev