Discussion:
16 year old bug
Christoph Egger
2010-08-23 11:53:40 UTC
... has been found by OpenBSD:

Their commit message:
--------------------------------------------
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
For masks of identical length rn_lexobetter() did not stop on the
first non-equal byte. This leads to rn_addroute() not detecting
duplicate entries and thus we might create a very long list of masks
to check for each node.
This can have a huge impact on IPsec performance, where non-contiguous
masks are used for the flow lookup. In a setup with 1300 flows we
saw 400 duplicate masks and only a third of the expected throughput.
--------------------------------------------
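
To make the failure mode in the commit message concrete, here is a simplified model of the comparison, not the kernel code and not the attached patch. The function names and the length-prefixed byte layout are assumptions made for illustration only. The "buggy" version keeps scanning after it has seen a byte where the first mask is smaller, so two different masks of equal length can each claim to sort before the other.

```python
def lexo_better_buggy(m: bytes, n: bytes) -> bool:
    """Model of the reported flaw: the scan continues past a byte where m < n."""
    if m[0] > n[0]:              # byte 0 is assumed to hold the mask length
        return True
    if m[0] == n[0]:
        for a, b in zip(m[:m[0]], n[:n[0]]):
            if a > b:            # never stops when a < b
                return True
    return False


def lexo_better_fixed(m: bytes, n: bytes) -> bool:
    """Decide on the first byte that differs, then stop."""
    if m[0] > n[0]:
        return True
    if m[0] == n[0]:
        for a, b in zip(m[:m[0]], n[:n[0]]):
            if a != b:
                return a > b
    return False


# Two hypothetical non-contiguous masks of identical length.
m = bytes([4, 0xFF, 0x00, 0xFF])
n = bytes([4, 0xFF, 0xFF, 0x00])

print(lexo_better_buggy(m, n), lexo_better_buggy(n, m))  # True True
print(lexo_better_fixed(m, n), lexo_better_fixed(n, m))  # False True
```

With the buggy comparison the order of m and n depends on which one is asked about first, which is the kind of inconsistency that would let duplicates slip past a sorted-list check.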

The patch is attached. Any comments?

Christoph
Antti Kantee
2010-08-23 12:03:45 UTC
Post by Christoph Egger
--------------------------------------------
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
For masks of identical length rn_lexobetter() did not stop on the
first non-equal byte. This leads to rn_addroute() not detecting
duplicate entries and thus we might create a very long list of masks
to check for each node.
This can have a huge impact on IPsec performance, where non-contiguous
masks are used for the flow lookup. In a setup with 1300 flows we
saw 400 duplicate masks and only a third of the expected throughput.
--------------------------------------------
The patch is attached. Any comments?
The test for this is missing.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Tom Spindler
2010-08-23 12:48:58 UTC
Post by Christoph Egger
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
[...]

Does our IPSEC code actually _use_ non-contiguous netmasks? While RFC 950
technically allows them, they're "recommended against", and most modern
network hardware will turn their nose up at them AFAIK.

(Note that I'm not saying that there isn't a bug in the way this routine
is used - but if non-contiguous netmasks are used elsewhere, I'd be very
surprised if other pieces of code also were not similarly 'buggy'.)


Matt Thomas
2010-08-25 00:53:38 UTC
Post by Joerg Sonnenberger
That's silly. A bitmask is a bitmask, and there's nothing magical or
difficult about masked compare. Even the bug OpenBSD just fixed -- now
that it basically doesn't matter any more -- is hardly complex nor is
the fix so.
The issue with non-cont netmask is that it dramatically complicates the
lookup code. I'd say that at least 1/3 of the radix tree implementation
is just related to this "feature".
Even worse, it's inefficient on newer hardware. Most platforms have a
count-leading-zeros operation which dramatically speeds up lookups. Also
knowing the datatype and using datatype specific comparison speeds it
up even more.
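
As a rough illustration of the count-leading point: with contiguous masks and a known fixed-width key, both "how long is this prefix" and "where do these two keys first differ" reduce to one bit-scan each. The sketch below uses Python's int.bit_length() as a stand-in for a hardware count-leading-zeros instruction; the function names and the 32-bit width are assumptions for the example, not anything taken from the radix or ptree code.

```python
def first_diff_bit(a: int, b: int, width: int = 32) -> int:
    """Index (0 = most significant) of the first bit where a and b differ.

    Returns width when the two keys are equal.
    """
    return width - (a ^ b).bit_length()


def prefix_len(mask: int, width: int = 32) -> int:
    """Prefix length of a mask. Only meaningful for left-contiguous masks."""
    inverted = ~mask & ((1 << width) - 1)
    return width - inverted.bit_length()


print(first_diff_bit(0xC0A80000, 0xC0A80100))  # 23
print(prefix_len(0xFFFFFC00))                  # 22
```

Neither shortcut works for a non-contiguous mask, where the mask has to be carried around and applied byte by byte or word by word.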

I've been removing the use of radix and switching to ptree in the network
rework.

Mindaugas Rasiukevicius
2010-08-25 01:09:37 UTC
Post by Matt Thomas
Post by Joerg Sonnenberger
That's silly. A bitmask is a bitmask, and there's nothing magical or
difficult about masked compare. Even the bug OpenBSD just fixed -- now
that it basically doesn't matter any more -- is hardly complex nor is
the fix so.
The issue with non-cont netmask is that it dramatically complicates the
lookup code. I'd say that at least 1/3 of the radix tree implementation
is just related to this "feature".
Even worse, it's inefficient on newer hardware. Most platforms have a
count-leading-zeros operation which dramatically speeds up lookups. Also
knowing the datatype and using datatype specific comparison speeds it
up even more.
I've been removing the use of radix and switching to ptree in the network
rework.
Seems like there are good reasons to kill that code, especially the code
complexity. I am also keen to see your ptree-based code.
--
Mindaugas

Thor Lancelot Simon
2010-08-23 14:13:04 UTC
Post by Christoph Egger
--------------------------------------------
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
I suggest removing support for non-contiguous netmasks. They are
Then please find another way for IPsec to match packets, before you rip
out the one it uses now.

Thor

Tom Spindler
2010-08-23 14:22:37 UTC
Post by Thor Lancelot Simon
Post by Christoph Egger
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
I suggest removing support for non-contiguous netmasks. They are
Then please find another way for IPsec to match packets, before you rip
out the one it uses now.
If you can't rely on the masks being contiguous, why not use (e.g.)
heapsort instead? I'd think the whole reason you have the specific
netmask-sorting algo is for the optimized case where you _do_ have
contiguous netmasks; if that doesn't hold, though, you may as well
use something that's already in the kernel and (presumably) tuned.


der Mouse
2010-08-23 14:40:20 UTC
Post by Tom Spindler
Post by Christoph Egger
Fix a 16 year old bug in the sorting routine for non-contiguous netmasks.
Does our IPSEC code actually _use_ non-contiguous netmasks?
I haven't looked at the IPsec code, so this is a guess, but the wording
makes it sound as though this is an implementation technique used
internally by IPsec rather than being the externally-visible use of
noncontiguous netmasks everyone seems to be taking it for.

That said,
Post by Tom Spindler
and most modern network hardware will turn their nose up at them
AFAIK.
IMO anything that pretends to implement IPv4 but which doesn't do
noncontiguous netmasks is simply broken, I don't care whether it comes
from Cisco or Netgear or NetBSD.

Not, I suppose, that anyone necessarily cares what I consider broken.

Slow-path them. Require a sysctl switch (the way we do for source
routes). Fine. But outright desupport them? I'd call that a bug,
even if it is done deliberately.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

Johnny Billquist
2010-08-23 15:13:05 UTC
Post by der Mouse
Post by Tom Spindler
and most modern network hardware will turn their nose up at them
AFAIK.
IMO anything that pretends to implement IPv4 but which doesn't do
noncontiguous netmasks is simply broken, I don't care whether it comes
from Cisco or Netgear or NetBSD.
Not, I suppose, that anyone necessarily cares what I consider broken.
Slow-path them. Require a sysctl switch (the way we do for source
routes). Fine. But outright desupport them? I'd call that a bug,
even if it is done deliberately.
I believe that non-contiguous netmasks actually are illegal nowadays.
They became illegal when CIDR was implemented.

That said, it might be worth having a way to enable the legacy view of
network address classes and netmasks, if someone wants to...?

Johnny

der Mouse
2010-08-24 01:48:40 UTC
Post by Johnny Billquist
I believe that non-contiguous netmasks actually are illegal nowadays.
Cite?
Post by Johnny Billquist
They became illegal when CIDR was implemented.
Implemented? I doubt it. Standardized, at most. But even then, it
would take years to eliminate everything that supports them - indeed, I
just now tried it and find that NetBSD 4.0.1 appears to support them.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

Johnny Billquist
2010-08-24 07:54:52 UTC
Post by der Mouse
Post by Johnny Billquist
I believe that non-contiguous netmasks actually are illegal nowadays.
Cite?
RFC 4632 (CIDR Address Strategy), section 5.1:

" An implementation following these rules should also be generalized,
so that an arbitrary network number and mask are accepted for all
routing destinations. The only outstanding constraint is that the
mask must be left contiguous."
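
For reference, the "left contiguous" constraint quoted above is cheap to test. This is only a sketch of one common bit trick, with a hypothetical function name and an assumed 32-bit width: invert the mask, and the result must be a run of low-order one bits, i.e. of the form 2^k - 1.

```python
def is_left_contiguous(mask: int, width: int = 32) -> bool:
    """True if mask is some number of 1 bits followed only by 0 bits."""
    inverted = ~mask & ((1 << width) - 1)
    # A value of the form 2**k - 1 shares no bits with its successor.
    return (inverted & (inverted + 1)) == 0


print(is_left_contiguous(0xFFFFFC00))  # True  (255.255.252.0)
print(is_left_contiguous(0xFFFFFA00))  # False (255.255.250.0)
```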

I haven't bothered checking RFC 1519, but I would expect the same thing
to be said there. RFC 4632 obsoletes RFC 1519, so RFC 1519 is only of
historical interest today anyway.
Post by der Mouse
Post by Johnny Billquist
They became illegal when CIDR was implemented.
Implemented? I doubt it. Standardized, at most. But even then, it
would take years to eliminate everything that supports them - indeed, I
just now tried it and find that NetBSD 4.0.1 appears to support them.
Implemented was the wrong word. Deployed is probably more correct.

Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: ***@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol

Rhialto
2010-09-02 15:31:30 UTC
Post by Johnny Billquist
Post by Johnny Billquist
I believe that non-contiguous netmasks actually are illegal nowadays.
Cite?
...which is titled "Rules for Route Advertisement". (Also, 4632 is a
BCP, not a standard.)
Post by Johnny Billquist
" An implementation following these rules should also be generalized,
so that an arbitrary network number and mask are accepted for all
routing destinations. The only outstanding constraint is that the
mask must be left contiguous."
With respect to route aggregation in advertisements (i.e.,
externally-visible behaviour). See the second paragraph of 5.2.
Also (thinking perversely), this never says that the 1-bits in the mask
must be left-justified... a mask of 00111111 11111111 11000000 00000000
has the whole mask contiguous...
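
The distinction being poked at here, "the 1 bits form a single run" versus "that run starts at the leftmost bit", can be spelled out as two separate predicates. This is a sketch with hypothetical names and an assumed 32-bit width, using the mask from the example above.

```python
def is_single_run(mask: int) -> bool:
    """True if all 1 bits in mask are adjacent (one run, anywhere)."""
    if mask == 0:
        return False
    x = mask // (mask & -mask)       # shift out the trailing zero bits
    return (x & (x + 1)) == 0        # what is left must be 2**k - 1


def is_left_justified(mask: int, width: int = 32) -> bool:
    """True if the most significant bit of mask is set."""
    return bool((mask >> (width - 1)) & 1)


mask = int("00111111" "11111111" "11000000" "00000000", 2)
print(hex(mask))                  # 0x3fffc000
print(is_single_run(mask))        # True
print(is_left_justified(mask))    # False
```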
/~\ The ASCII Mouse
-Olaf.
--
___ Olaf 'Rhialto' Seibert -- There's no point being grown-up if you
\X/ rhialto/at/xs4all.nl -- can't be childish sometimes. -The 4th Doctor

der Mouse
2010-08-24 04:02:42 UTC
Was [running my house LAN with a noncontiguous netmask], for
practical purposes, unsupportable? Was it something likely to cause
subtle bugs all over the networking stack? Was it something
obsoleted more or less 20 years ago? All yes.
Actually, no.

Unsupportable? I don't see anything unsupportable about it. Every
system I tried (which admittedly wasn't all that many) supported it
fine. Even today, I tried NetBSD 4.0.1 (the most recent I have easy
admin access to) and it appeared to support it as well as whatever I
was using at the time did - though admittedly I didn't actually verify
that packets were routed the way the resulting routing table implied.

Likely to cause bugs? Nonsense. Likely to expose existing bugs,
perhaps. Do you not consider exposing existing bugs a good thing?
I know I certainly do.

Obsoleted 20 years ago? Perhaps. Strikes me as pretty functional and
useful for an "obsoleted" feature. Besides, this _was_ 20 years ago -
well, actually more like 15±5; I didn't have much of a house LAN
before maybe 1991, and I stopped using the address space this was
embedded in sometime around 2000-2001.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

Steven Bellovin
2010-08-24 04:18:00 UTC
Post by der Mouse
Was [running my house LAN with a noncontiguous netmask], for
practical purposes, unsupportable? Was it something likely to cause
subtle bugs all over the networking stack? Was it something
obsoleted more or less 20 years ago? All yes.
Actually, no.
Unsupportable? I don't see anything unsupportable about it. Every
system I tried (which admittedly wasn't all that many) supported it
fine. Even today, I tried NetBSD 4.0.1 (the most recent I have easy
admin access to) and it appeared to support it as well as whatever I
was using at the time did - though admittedly I didn't actually verify
that packets were routed the way the resulting routing table implied.
Likely to cause bugs? Nonsense. Likely to expose existing bugs,
perhaps. Do you not consider exposing existing bugs a good thing?
I know I certainly do.
Obsoleted 20 years ago? Perhaps. Strikes me as pretty functional and
useful for an "obsoleted" feature. Besides, this _was_ 20 years ago -
well, actually more like 15±5; I didn't have much of a house LAN
before maybe 1991, and I stopped using the address space this was
embedded in sometime around 2000-2001.
The problem is, as has been noted, the lack of a good definition of the routing table with mixed prefixes. If everyone uses a mask of, say, 0xA596695A, it all just works. But if some routers use 0xA95696A5 and others use 0xA596695A, the semantics are unclear.
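
One way to see the ambiguity, offered as an illustration rather than as the definition used by any routing code: with contiguous masks, any two masks are ordered (one always contains the other's bits), so "most specific match" is well defined. The two masks above have the same number of 1 bits and neither contains the other, so a destination matched by a route under each has no obvious winner. The function name below is hypothetical.

```python
def comparable(m1: int, m2: int) -> bool:
    """True if one mask's 1 bits are a superset of the other's."""
    common = m1 & m2
    return common == m1 or common == m2


A = 0xA596695A
B = 0xA95696A5

print(bin(A).count("1"), bin(B).count("1"))   # 16 16
print(comparable(A, B))                       # False
print(comparable(0xFFFFFF00, 0xFFFF0000))     # True
```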

Variable-length masks are not simply a matter of the enterprise/ISP boundary. They can and do occur within an organization. My own department has at least 3 different prefix lengths. And that problem is old -- I'm sure other folks who've been in this racket for a while remember the SUBNETSARELOCAL kernel configuration option.

Non-contiguous masks can indeed be useful, albeit only in specialized topologies and networks. I could have used them in a paper I published just 1.5 years ago. The trouble is that they conflicted with the routing table definition necessary for CIDR, and CIDR was and is necessary for the survival of the Internet.

None of this, however, has any relationship to what the original poster said, which is that the current code is also used in IPsec and has a performance bug. And *that* is completely unrelated to whether or not non-contiguous masks are a good idea!


--Steve Bellovin, http://www.cs.columbia.edu/~smb






Joerg Sonnenberger
2010-08-24 07:25:10 UTC
That's silly. A bitmask is a bitmask, and there's nothing magical or
difficult about masked compare. Even the bug OpenBSD just fixed -- now
that it basically doesn't matter any more -- is hardly complex nor is
the fix so.
The issue with non-cont netmask is that it dramatically complicates the
lookup code. I'd say that at least 1/3 of the radix tree implementation
is just related to this "feature".

Joerg

Joerg Sonnenberger
2010-08-24 07:27:38 UTC
I have. For a significant time (years) I was running my house LAN with
a netmask ending in (binary) 11011000, I think it was - a /29 expanded
by adding a second /29 from higher up. (The memory is very fuzzy, but
255.255.255.216 looks right.)
The reason was exactly this: growing the space without renumbering when
the original space's pair had already been allocated elsewhere. Was it
necessary? Not for most values of "necessary". Was it useful?
Definitely. Not visible outside its parent network, of course, but
that's true of most subnetting schemes, including CIDR ones, and it was
in live use for years.
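
To show what the quoted 255.255.255.216 mask actually selects, here is a small masked-compare sketch. The 192.0.2.0 network is a documentation prefix picked purely for illustration (the real address space is not given in the thread), and the helper name is made up.

```python
import ipaddress


def matches(addr: str, net: str, mask: str) -> bool:
    """Plain masked compare: addr is on net if the masked bits agree."""
    a = int(ipaddress.IPv4Address(addr))
    n = int(ipaddress.IPv4Address(net))
    m = int(ipaddress.IPv4Address(mask))
    return (a & m) == (n & m)


MASK = "255.255.255.216"   # last octet is binary 11011000
NET = "192.0.2.0"          # hypothetical network

last_octets = [h for h in range(256) if matches(f"192.0.2.{h}", NET, MASK)]
print(last_octets)
# [0, 1, 2, 3, 4, 5, 6, 7, 32, 33, 34, 35, 36, 37, 38, 39]
```

That is two /29-sized blocks, .0-.7 and .32-.39, treated as one subnet, which fits the description of a /29 extended with a second /29 from higher up.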
Pretty much the same result can be obtained by running something like
Roy's parpd on the router and just configuring the end machines for the
bigger subnet.

Joerg

Perry E. Metzger
2010-08-24 13:47:22 UTC
On Mon, 23 Aug 2010 23:21:37 -0400 Thor Lancelot Simon
On Mon, 23 Aug 2010 21:46:16 -0400 (EDT) der Mouse
The reason was exactly this: growing the space without
renumbering when the original space's pair had already been
allocated elsewhere. Was it necessary? Not for most values of
"necessary". Was it useful? Definitely.
Was it, for practical purposes, unsupportable? Was it something
likely to cause subtle bugs all over the networking stack? Was it
something obsoleted more or less 20 years ago? All yes.
That's silly. A bitmask is a bitmask, and there's nothing magical
or difficult about masked compare.
It is difficult to reason about the effects of non-contiguous
bitmasks when dealing with security critical code. What happens when
one is using bitmasks as part of generating data structures in filter
code? What happens when you deal with non-contiguous blocks as part of
building routing data structures?

There are loads of places where odd effects can happen. Code that
is rarely if ever used is code where dragons lurk.
I could care less whether support for noncontiguous subnet masks
were to disappear,
Then don't argue against it.
but I would strongly prefer that nothing _else_
in the system that relies on the IP stack supporting them be
needlessly broken in the process just so we can say we're modern
and stylish. That's just irresponsible.
I don't think anyone was arguing in favor of breaking the stack in
order to be stylish. I think people were arguing in favor of getting
rid of a whole class of problems by doing input validation.
--
Perry E. Metzger ***@piermont.com

der Mouse
2010-08-24 15:41:06 UTC
I wouldn't say _nothing_. See below.
That's why I said "essentially nothing" - for your two /29's, you
must have had a max of 14 hosts. You could have renumbered those in
less than half an hour, even if you had to manually do it to every
one.
Well, there were external pointers to at least a few of those
addresses. Fixing even just the ones I knew about would have been
nontrivial. Indeed, that was one of the most significant aspects of
switching away from that address space when I finally did; fortunately,
I had the opportunity to run the old and new space in parallel on the
same broadcast domain for a substantial time, making it much less pain
to switch the external pointers.
That is, implementations are free to do whatever they like (these
days) if you use non-contig masks.
Sure; it's a quality-of-implementation issue, same as, say, accepting
only (in today's terminology) mask widths that are multiples of 8 bits.
And yes, I ran into some such implementations back in the '80s; even
then, we considered them annoyingly broken, much as I'd consider an
implementation that misbehaved in the presence of noncontiguous masks.
Or one that couldn't be configured to obey source routes. Or any of
endless other issues.

I recall talking with someone, once, who was involved in interop
testing back in the '80s/'90s, before support for masks other than /8,
/16, and /24 became "universal". The tale as I remember it was that
he'd tell vendors that "the mask on the show floor network is
255.255.252.0", with them looking worried; when feeling evil, he'd then
tell them "and next year it'll be 255.255.250.0", followed by their
(figuratively) running screaming, with a speed dependent on their
ability to convert decimal to binary. (I don't know how much truth
there is in it; I suspect they never actually did that. Pity.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

John Nemeth
2010-08-24 21:05:44 UTC
On Dec 9, 11:48pm, der Mouse wrote:
}
} > That is, implementations are free to do whatever they like (these
} > days) if you use non-contig masks.
}
} Sure; it's a quality-of-implementation issue, same as, say, accepting
} only (in today's terminology) mask widths that are multiples of 8 bits.

In order to properly do CIDR you also need to do VLSM (Variable
Length Subnet Masking). In the year 2010, any system that couldn't do
that would be considered so broken as to be essentially useless. On
the other hand, in the year 2010, you can get away without
non-contiguous subnet masks 99.99% of the time.

} And yes, I ran into some such implementations back in the '80s; even
} then, we considered them annoyingly broken, much as I'd consider an
} implementation that misbehaved in the presence of noncontiguous masks.
} Or one that couldn't be configured to obey source routes. Or any of
} endless other issues.
}
} I recall talking with someone, once, who was involved in interop
} testing back in the '80s/'90s, before support for masks other than /8,
} /16, and /24 became "universal". The tale as I remember it was that
} he'd tell vendors that "the mask on the show floor network is
} 255.255.252.0", with them looking worried; when feeling evil, he'd then

This is a /22, not a big deal.

} tell them "and next year it'll be 255.255.250.0", followed by their

This, on the other hand, I can't translate instantly, but I do know
that it is non-contiguous.
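
For anyone who wants the translation without doing it in their head, a throwaway conversion (the helper name is arbitrary):

```python
def mask_to_binary(mask: str) -> str:
    """Render a dotted-quad mask as dotted 8-bit binary groups."""
    return ".".join(f"{int(octet):08b}" for octet in mask.split("."))


print(mask_to_binary("255.255.252.0"))  # 11111111.11111111.11111100.00000000
print(mask_to_binary("255.255.250.0"))  # 11111111.11111111.11111010.00000000
```

The third octet of the second mask, 11111010, has a 0 between 1 bits, which is what makes it non-contiguous.
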

} (figuratively) running screaming, with a speed dependent on their
} ability to convert decimal to binary. (I don't know how much truth
} there is in it; I suspect they never actually did that. Pity.)
}
}-- End of excerpt from der Mouse
