Discussion:
Multipath
(too old to reply)
Mihai Chelaru
2007-08-23 07:51:55 UTC
Permalink
Hi,

I'm working on a small patch [1] to get route multipath working in order to
make some private tests. While I'm not happy with the approach (chained
rtentries), I did try to stay away from radix tree implementation because I
didn't feel very comfortable working directly there :)

But now I wonder how should a rtsock response to a RTM_GET query look like if
there are multiple paths for the same destination ? For now I return only one
path in a round-robin fashion but I don't know if this is the correct thing
to do. I wonder how other BSDs are doing this.

[1] - http://kefren.ngnetworks.ro/multipath.diff
--
Thanks,
Mihai

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
David Young
2007-08-23 08:59:09 UTC
Permalink
Post by Mihai Chelaru
Hi,
I'm working on a small patch [1] to get route multipath working in order to
make some private tests. While I'm not happy with the approach (chained
rtentries), I did try to stay away from radix tree implementation because I
didn't feel very comfortable working directly there :)
It is good that you stayed away from the radix trie, and you should stay
far away. :-) Both experience and another developer have convinced me
that it is a bad place to add this function.
Post by Mihai Chelaru
But now I wonder how should a rtsock response to a RTM_GET query look like if
there are multiple paths for the same destination ? For now I return only one
path in a round-robin fashion but I don't know if this is the correct thing
to do. I wonder how other BSDs are doing this.
At least at one time, OpenBSD used RADIX_MPATH. Besides the problem
that RADIX_MPATH fiddles with the radix trie, I found a couple of other
problems that arose when you had a mixture of direct & indirect routes
(link-level nexthop or IP nexthop) to the same destination.

IMO, RTM_GET should return the route that the kernel would choose, given
the information you provide (including source address...), or else an
error if its choice is ambiguous or inconstant.

BTW, round-robin is not ordinarily a good multipath routing policy.
RADIX_MPATH references gateway "selection by Modulo-N Hash (RFC2991)".
Check that out.
Post by Mihai Chelaru
[1] - http://kefren.ngnetworks.ro/multipath.diff
I see a couple of places where you clear the RNF_ACTIVE bit in
rtrequest1(); resist the temptation. You also memcpy the radix_nodes.
Do not touch the radix trie data structures directly, but use the
radix_node_head methods, instead. I suggest doing a delete+add with
the old and new "head" rtentry.

I don't think that you should RTFREE() chained rtentries in rtflush(),
unless you are going to increase the reference count on an rtentry's
chained rtentries when you rtcache() it.

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Mihai Chelaru
2007-08-23 10:29:05 UTC
Permalink
Post by David Young
BTW, round-robin is not ordinarily a good multipath routing policy.
RADIX_MPATH references gateway "selection by Modulo-N Hash (RFC2991)".
Check that out.
Thanks for pointing me that. The problem with such implementation is that it
is protocol depedant. I'll implement few algorithms including this header
hashing next days but probably I'll have to leave round-robin as default
option.

Also I have to implement unequal ballancing but I don't think that the current
rt_metrics struct can help me and probably I'll have to add another variable
there.
Post by David Young
I see a couple of places where you clear the RNF_ACTIVE bit in
rtrequest1(); resist the temptation.
I avoid there a call to rn_delete1 in order to keep radix node alive and
RNF_ACTIVE must be cleared for rtfree() afterwards. Isn't it OK ?
Post by David Young
You also memcpy the radix_nodes.
This is a must in order to avoid double allocation of netmask. Or isn't it ?
Post by David Young
Do not touch the radix trie data structures directly, but use the
radix_node_head methods, instead. I suggest doing a delete+add with
the old and new "head" rtentry.
Are you talking about rtentry memcpy in rtfree() ? I was thinking that it's
faster and harmless.
Post by David Young
I don't think that you should RTFREE() chained rtentries in rtflush(),
unless you are going to increase the reference count on an rtentry's
chained rtentries when you rtcache() it.
ewww... I shouldn't traverse the list there. Thanks for pointing me that.
--
Thanks,
Mihai

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Thor Lancelot Simon
2007-08-23 16:57:10 UTC
Permalink
Post by Mihai Chelaru
Post by David Young
BTW, round-robin is not ordinarily a good multipath routing policy.
RADIX_MPATH references gateway "selection by Modulo-N Hash (RFC2991)".
Check that out.
Thanks for pointing me that. The problem with such implementation is that it
is protocol depedant.
It has to be. It can't really work right otherwise. Other protocols
that care about avoiding path asymmetry when multiple paths between two
nodes in the network are in use define part of the network-layer protocol
header for path selection (see SS7 MTP for an example, and particularly the
treatment of the SLS field by most national variants) but IP doesn't so
you cannot really avoid this issue.

This is one thing that, in my opinion at least, should have been fixed
in IPv6. But it is a little late now!
--
Thor Lancelot Simon ***@rek.tjls.com

"The inconsistency is startling, though admittedly, if consistency is to
be abandoned or transcended, there is no problem." - Noam Chomsky

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
David Young
2007-09-09 01:40:47 UTC
Permalink
Post by Mihai Chelaru
Post by David Young
I see a couple of places where you clear the RNF_ACTIVE bit in
rtrequest1(); resist the temptation.
I avoid there a call to rn_delete1 in order to keep radix node alive and
RNF_ACTIVE must be cleared for rtfree() afterwards. Isn't it OK ?
Post by David Young
You also memcpy the radix_nodes.
This is a must in order to avoid double allocation of netmask. Or isn't it ?
You need to take care to fix up embedded pointers, and pointers to the
radix_nodes, but you have not.
Post by Mihai Chelaru
Post by David Young
Do not touch the radix trie data structures directly, but use the
radix_node_head methods, instead. I suggest doing a delete+add with
the old and new "head" rtentry.
Are you talking about rtentry memcpy in rtfree() ? I was thinking that it's
faster and harmless.
I urge you to wait to optimize for speed and space until after your code
works. :-)

Do not rely on implementation details of the route table, because they
will change. Instead, use the accessor functions the table provides:
rn_delete, rn_addroute, et cetera. (I know that the existing code is not
perfectly consistent about this, but gradually I have been making it so.)

Dave
--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Loading...