dropped routing socket messages...
J.T. Conklin
2008-05-16 18:16:53 UTC
I have a system that uses a user-space routing table implementation.
It uses the kernel routing table as a forwarding table, and installs
and removes routes as needed by the higher level routing policy. It
also opens a routing socket and monitors it for addresses being added
and deleted from interfaces, interface link state, etc.

Occasionally (less than 1% of system restarts), it seems that routing
table events are being lost which results in my user-space routing
table getting out of sync with reality.

Under what circumstances would this happen? My initial hypotheses were
that either the kernel couldn't allocate an mbuf for the event, or the
routing socket receive buffer wasn't large enough and the event was
dropped then. But I don't see any "requests for memory denied" in
the mbuf stats (netstat -m); and I've set the socket buffer size to
128K, and the total number of events is way lower than that.
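
One way to rule out the buffer hypothesis is to read the size back after setting it, since a kernel may clamp, adjust, or refuse the requested value. The following is a minimal sketch of that check, not the poster's code: the helper name request_rcvbuf is hypothetical, and a UDP socket stands in for the routing socket so the snippet runs on any system.

```python
import socket

def request_rcvbuf(sock, wanted):
    """Ask for a receive buffer of `wanted` bytes and return the size
    the kernel reports afterwards (which may differ from the request)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# Stand-in socket (assumption): a real monitor would pass its routing
# socket here instead of a UDP socket.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    wanted = 128 * 1024
    granted = request_rcvbuf(s, wanted)
    print(f"requested {wanted} bytes, kernel reports {granted}")
```

If the reported value is smaller than the request, the buffer is less roomy than assumed and overflow stays on the list of suspects.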

Are there any other likely reasons why routing socket events would
be dropped? FWIW, this is a NetBSD-4 kernel.

--jtc
--
J.T. Conklin

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Christos Zoulas
2008-05-16 20:04:22 UTC
No, I don't know but I would add some debugging code in the error paths.

christos


Dennis Ferguson
2008-05-16 20:45:02 UTC
There seems to be another queue between the hard interrupt and soft
interrupt levels where messages can also be dropped. See
route_enqueue().
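
To illustrate why such an intermediate queue matters, here is a toy model of a bounded queue that discards new items when full. It is only a sketch of the general drop-on-full behaviour, not code from the NetBSD kernel; the class name DropTailQueue and the limit of 4 are arbitrary assumptions chosen for the example.

```python
from collections import deque

class DropTailQueue:
    """Bounded FIFO that silently discards new items once it is full."""

    def __init__(self, maxlen):
        self.maxlen = maxlen
        self.items = deque()
        self.drops = 0          # count of discarded items

    def enqueue(self, item):
        if len(self.items) >= self.maxlen:
            self.drops += 1     # producer gets no chance to retry
            return False
        self.items.append(item)
        return True

    def drain(self):
        """Consumer side: take everything queued so far."""
        out = list(self.items)
        self.items.clear()
        return out

# A burst of six events arrives before the consumer runs once.
q = DropTailQueue(maxlen=4)
results = [q.enqueue(f"event-{i}") for i in range(6)]
delivered = q.drain()
print(delivered, "dropped:", q.drops)
```

The point of the model is that a short burst can overflow this stage even when the socket receive buffer further downstream is nowhere near full, so a drop counter at this stage is worth checking.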

I'd note, however, that the routing socket's best-effort design makes
it inherently unreliable, or at least unscalable, for tracking state
like that. I've worked on very large systems which attempted to use the
routing socket for that, and the solution for getting a reliable outcome
always ended up being redesigning all the data structures which were
associated with routing socket messages so that state changes could be
tracked in the structures themselves and messages were only formatted
and delivered when the application asked for them. This is a lot of work.

Dennis Ferguson

