rfc: socketing it to gre

Discussion:

David Young

2007-09-13 04:08:17 UTC

I have made a patch against -current that makes gre(4) exclusively use
sockets for packet (dis)encapsulation and (de)muxing. In the new GRE
world order, you will not see gre(4) produce its own IP header, nor will
you find its hooks in the IP stack: sys/netinet/ip_gre.[ch] are no more!
The patch is at <ftp://cuw.ojctech.com/cuw/dyoung-9cb5d230/gre.patch>.

This is a work in progress. Right now, the only encapsulations I
support are GRE in UDP in IPv4, and GRE in IPv4. In the near future I
will support IPv6, and UDP in IPv6. In principle, you could use some
OSI/AppleTalk datagram protocol for the encapsulation, but that's not
supported quite yet.

I think that more of our tunnel interfaces deserve this socket treatment,
eventually: etherip, gif, and stf.

Review welcome. As always, I will appreciate your help with testing.

*** Some implementation details follow. ***

This patch adds a new protocol/address family-unaware socket option
that tells a socket that it should both add a protocol header to tx'd
datagrams and remove the header from rx'd datagrams. It helps me keep
knowledge of the encapsulating protocol *out* of gre(4). Only sockets
of type (AF_INET, SOCK_RAW) implement the option, right now. Example use:

int onoff = 1, s = socket(...);
setsockopt(s, SOL_SOCKET, SO_NOHEADER, &onoff, sizeof(onoff));
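
For comparison, here is roughly what a reader of an (AF_INET, SOCK_RAW) socket has to do when the option is not set: the received datagram still carries its IPv4 header, and the payload starts only after it. This is a minimal sketch in portable C, not code from the patch; strip_ipv4_header() is a hypothetical name chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return the offset of the
 * payload inside a received IPv4 datagram, or 0 if the buffer does not
 * look like a valid IPv4 packet.
 */
size_t
strip_ipv4_header(const uint8_t *pkt, size_t len)
{
	size_t hlen;

	if (len < 20)			/* shorter than a minimal IPv4 header */
		return 0;
	if ((pkt[0] >> 4) != 4)		/* version field must be 4 */
		return 0;
	hlen = (size_t)(pkt[0] & 0x0f) * 4;	/* IHL counts 32-bit words */
	if (hlen < 20 || hlen > len)	/* bogus or truncated header */
		return 0;
	return hlen;
}
```

With the option set, this bookkeeping would live in the socket layer, so a caller such as gre(4) would see only the GRE header and the inner packet.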

I have also added a new subroutine, fsocreate(), that creates a socket
a la socreate(9) and sticks it into a process's file descriptor table.
I use it in both gre(4) and sys_socket(). gre sticks the sockets it owns
into process 0's file table, so it's clear which socket belongs to who.

Dave

--
David Young OJC Technologies
***@ojctech.com Urbana, IL * (217) 278-3933 ext 24

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Darren Reed

2007-09-13 06:24:09 UTC

David,

Please don't put inline functions in header files, especially using
names that are all in capital letters and thus look like macros.

They're my favourite pet hate in Linux because people actually
put important functionality in them.

Also, as this work brings in a new socket type, I think it would
be better to bring in the socket changes first, independent of the
gre changes.

Darren

David Young

2007-09-14 17:37:57 UTC

Post by Darren Reed
David,
Please don't put inline functions in header files, especially using
names that are all in capital letters and thus look like macros.

I don't see any harm in putting inline functions into header files,
especially not when they will be shared by many C files. I don't know
where else to put inline subroutines that C files will share; maybe you
can suggest some place. Lowering the case of macros turned to inlines
is a lot of pain for very little gain.

Dave

Darren Reed

2007-09-15 05:21:12 UTC

Post by David Young

Post by Darren Reed
David,
Please don't put inline functions in header files, especially using
names that are all in capital letters and thus look like macros.

...why are they better served by being an inline function than a macro?

Darren

Thor Lancelot Simon

2007-09-15 05:31:14 UTC

Post by Darren Reed
...why are they better served by being an inline function than a macro?

The only thing that comes immediately to mind is that you get compile time
typechecking on the arguments to inline functions, instead of just any
functions the macros happen to call. But that's not a tremendous gain...

--
Thor Lancelot Simon ***@rek.tjls.com

"The inconsistency is startling, though admittedly, if consistency is to
be abandoned or transcended, there is no problem." - Noam Chomsky

David Young

2007-09-15 06:35:17 UTC

Post by Darren Reed

Post by David Young

Post by Darren Reed
David,
Please don't put inline functions in header files, especially using
names that are all in capital letters and thus look like macros.

...why are they better served by being an inline function than a macro?

With an inline function, no backslash line continuations are necessary.
You get compile-time type checking, and useful line number information
for compile errors. Function arguments are evaluated once and only
once---that is, no surprise side-effects. Function arguments are "atoms"
(each use doesn't need to be parenthesized). You get a new scope, so
that arguments and temporary variables don't need leading underscores
added to keep them from conflicting with variables in the caller's
scope. Inline subroutines contain less syntactic clutter (parens,
underscores, backslashes) so they are easier to read and to write.
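
A minimal sketch of the "evaluated once and only once" point follows. The names MAX_MACRO, max_inline, and next_value are made up for illustration and do not come from the patch.

```c
/* Macro version: each argument is pasted in textually, so an argument
 * with a side effect may be evaluated more than once. */
#define MAX_MACRO(a, b)	((a) > (b) ? (a) : (b))

/* Inline version: each argument is evaluated exactly once, and the
 * compiler checks the argument types. */
static inline int
max_inline(int a, int b)
{
	return a > b ? a : b;
}

static int calls;	/* number of next_value() calls so far */

/* A function with a visible side effect. */
static int
next_value(void)
{
	return ++calls;
}

/* How many times does the macro form call next_value()? */
int
calls_via_macro(void)
{
	calls = 0;
	(void)MAX_MACRO(next_value(), 0);
	return calls;
}

/* How many times does the inline form call next_value()? */
int
calls_via_inline(void)
{
	calls = 0;
	(void)max_inline(next_value(), 0);
	return calls;
}
```

Here calls_via_macro() returns 2, because the macro expands next_value() once in the comparison and again in the selected branch, while calls_via_inline() returns 1.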

I don't think that macros can be justified in new code, except where
a macro has a unique capability; for example, syntactic fiddling with
the # operator is needed. In old code, I don't think there is any use
preserving a macro if an inline will do.

Dave

der Mouse

2007-09-15 14:32:26 UTC

Post by David Young

[...inlines vs macros...]

I don't think that macros can be justified in new code, except where
a macro has a unique capability; for example, syntactic fiddling with
the # operator is needed. In old code, I don't think there is any
use preserving a macro if an inline will do.

Unless the code *must* be inline for some reason - I think a macro is
the only way to *force* code inline. (I haven't seen this requirement
often, but I've seen it.)

/~\ The ASCII                             der Mouse
\ / Ribbon Campaign
 X  Against HTML               ***@rodents.montreal.qc.ca
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

David Laight

2007-09-15 23:24:54 UTC

Post by David Young

Post by Darren Reed
David,
Please don't put inline functions in header files, especially using
names that are all in capital letters and thus look like macros.

I don't see any harm in putting inline functions into header files,
especially not when they will be shared by many C files.

Ok, I'll admit to not having looked at the code in question, but
if these functions are non-trivial inlining them may be a performance
loss. If trivial, a #define will suffice!

David

--
David Laight: ***@l8s.co.uk

Bill Stouder-Studenmund

2007-09-14 19:14:11 UTC

Post by David Young
I have made a patch against -current that makes gre(4) exclusively use
sockets for packet (dis)encapsulation and (de)muxing. In the new GRE
world order, you will not see gre(4) produce its own IP header, nor will
you find its hooks in the IP stack: sys/netinet/ip_gre.[ch] are no more!
The patch is at <ftp://cuw.ojctech.com/cuw/dyoung-9cb5d230/gre.patch>.
This is a work in progress. Right now, the only encapsulations I
support are GRE in UDP in IPv4, and GRE in IPv4. In the near future I
will support IPv6, and UDP in IPv6. In principle, you could use some
OSI/AppleTalk datagram protocol for the encapsulation, but that's not
supported quite yet.

How much of a performance hit do we take for this?

Would this permit us to do netgraph-like mix&match in the future?

Take care,

Bill

David Young

2007-09-14 21:09:08 UTC

Post by Bill Stouder-Studenmund

How much of a performance hit do we take for this?

I haven't measured. I will.

I believe we will see an improvement in performance at the receiver when
there are hundreds or thousands of tunnel interfaces, because the socket
demux code uses a hash instead of walking a list.

I believe the socket code will be a better place to address performance
problems than in code cut&pasted into umpteen different tunnel
pseudo-interfaces. Something I should have drawn more attention to in
my original email was that hundreds of lines of code can disappear;
I don't know if that will help somebody optimize NetBSD, but I don't
think that their job is getting more complicated. More on that, below.

Post by Bill Stouder-Studenmund
Would this permit us to do netgraph-like mix&match in the future?

Maybe so. What do you have in mind?

Let me tell you what I have in mind: I would like for there to be a tunnel
"superclass." Let us derive gif, gre, etherip from it by supplying a
method that adds/subtracts the tunnel "shim" between the outer header
and the inner packet. As before, the user sees gif, gre, or etherip,
but they are just personalities of the same code. I think that stf
and a hypothetical Teredo interface will withstand the same treatment,
but they are a special case.
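
One way to picture the "superclass" is a small table of per-personality operations consulted by shared code. The sketch below is an assumption about how that could look, not the design in the patch; struct tunnel_ops, tunnel_encap(), and tunnel_decap() are hypothetical names. The shim layouts follow the GRE base header (RFC 2784) and the EtherIP header (RFC 3378); gif is plain IP-in-IP and has no shim.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "superclass": a personality supplies only its shim. */
struct tunnel_ops {
	const char *name;
	size_t shim_len;	/* bytes between outer header and inner packet */
	void (*write_shim)(uint8_t *shim, uint16_t ethertype);
};

/* GRE base header: 16 bits of flags/version (all zero here), then the
 * 16-bit protocol type in network byte order. */
static void
gre_write_shim(uint8_t *shim, uint16_t ethertype)
{
	shim[0] = 0;
	shim[1] = 0;
	shim[2] = (uint8_t)(ethertype >> 8);
	shim[3] = (uint8_t)(ethertype & 0xff);
}

/* EtherIP header: version 3 in the top four bits, the rest reserved. */
static void
etherip_write_shim(uint8_t *shim, uint16_t ethertype)
{
	(void)ethertype;	/* the inner packet is always an Ethernet frame */
	shim[0] = 0x30;
	shim[1] = 0;
}

const struct tunnel_ops gre_ops = { "gre", 4, gre_write_shim };
const struct tunnel_ops etherip_ops = { "etherip", 2, etherip_write_shim };
const struct tunnel_ops gif_ops = { "gif", 0, NULL };

/* Shared code: write the shim, then the inner packet, into out.
 * Returns the total length, or 0 if out is too small. */
size_t
tunnel_encap(const struct tunnel_ops *ops, uint16_t ethertype,
    const uint8_t *inner, size_t inner_len, uint8_t *out, size_t out_len)
{
	if (out_len < ops->shim_len || out_len - ops->shim_len < inner_len)
		return 0;
	if (ops->write_shim != NULL)
		ops->write_shim(out, ethertype);
	memcpy(out + ops->shim_len, inner, inner_len);
	return ops->shim_len + inner_len;
}

/* Shared code: skip the shim on receive. Returns a pointer to the inner
 * packet and stores its length, or returns NULL if pkt is too short. */
const uint8_t *
tunnel_decap(const struct tunnel_ops *ops, const uint8_t *pkt, size_t len,
    size_t *inner_len)
{
	if (len < ops->shim_len)
		return NULL;
	*inner_len = len - ops->shim_len;
	return pkt + ops->shim_len;
}
```

The outer header is left to the socket in this picture, so the only thing that differs between personalities is the entry in the table.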

Dave

Bill Stouder-Studenmund

2007-09-18 19:23:07 UTC

Post by David Young

Post by Bill Stouder-Studenmund
How much of a performance hit do we take for this?

I haven't measured. I will.

Thanks!

Post by David Young
I believe we will see an improvement in performance at the receiver when
there are hundreds or thousands of tunnel interfaces, because the socket
demux code uses a hash instead of walking a list.

Cool!

Post by David Young
I believe the socket code will be a better place to address performance
problems than in code cut&pasted into umpteen different tunnel
pseudo-interfaces. Something I should have drawn more attention to in
my original email was that hundreds of lines of code can disappear;
I don't know if that will help somebody optimize NetBSD, but I don't
think that their job is getting more complicated. More on that, below.

As long as we don't have a punishingly-large performance hit, the cleaning
up in and of itself probably is a good enough reason to do this.

Post by David Young

Post by Bill Stouder-Studenmund
Would this permit us to do netgraph-like mix&match in the future?

Maybe so. What do you have in mind?

Nothing in particular. I just remember hearing something about Netgraph
being able to create weird configuration topologies, and it sounds like
this could also plumb stuff together.

Post by David Young
Let me tell you what I have in mind: I would like for there to be a tunnel
"superclass." Let us derive gif, gre, etherip from it by supplying a
method that adds/subtracts the tunnel "shim" between the outer header
and the inner packet. As before, the user sees gif, gre, or etherip,
but they are just personalities of the same code. I think that stf
and a hypothetical Teredo interface will withstand the same treatment,
but they are a special case.

Cool!

Take care,

Bill

YAMAMOTO Takashi

2007-09-19 11:18:24 UTC

Post by David Young
I have also added a new subroutine, fsocreate(), that creates a socket
a la socreate(9) and sticks it into a process's file descriptor table.
I use it in both gre(4) and sys_socket(). gre sticks the sockets it owns
into process 0's file table, so it's clear which socket belongs to who.

why do you want to use file descriptors, rather than pointers?
using process 0's file table for gre sounds weird to me.

YAMAMOTO Takashi

David Young

2007-09-19 18:50:32 UTC

Post by YAMAMOTO Takashi

why do you want to use file descriptors, rather than pointers?
using process 0's file table for gre sounds weird to me.

Using file descriptors lets us see who owns/shares a socket using
fstat(1). It is especially helpful for seeing if userland and the kernel
share a socket, just in case userland has delegated its socket to the
kernel using GRESSOCK.

Dave

David Young

2007-10-02 21:04:51 UTC

Post by Bill Stouder-Studenmund

Post by David Young

Post by Bill Stouder-Studenmund
How much of a performance hit do we take for this?

I haven't measured. I will.

Thanks!

I have measured the TCP performance with iperf, now. Initial tests
showed a significant slowdown. A 133-MHz Soekris net4521,

cpu0: AMD Am5x86 W/B 133/160 (486-class), id 0x4f4

could sink a TCP stream over fully-socketized gre(4) at 75% of the speed
of the gre(4) in -current today. The same box could source a TCP stream
over fully-socketized gre(4) at 90% of the speed of gre(4) in -current today.

(The other end in all my tests is a Pentium 4 running -current from
January.)

The net4521 is very slow by today's standards. To get an idea of the
performance on a faster machine, I switched to a net4801,

cpu0: National Semiconductor Geode GX1 (586-class), 266.62 MHz, id 0x540

Performance was better, but dissatisfying. Taking some advice from
Matt Thomas and Andrew Doran, I made many optimizations: I stopped
waking a kernel thread to call sosend() and soreceive(); instead,
I use slimmed-down copies of sosend() and soreceive() that I can run
directly from a software interrupt and from a socket upcall, respectively.
The kernel thread is completely out of the xmit and recv paths, and on the
net4801, socketized gre(4) now runs at about 96% of the speed of -current
in both directions, a slowdown that is IMO acceptable. I will test on the
net4521 in a while.

Your thoughts?

Post by Bill Stouder-Studenmund

Cool!

I was wrong about the socket demux hash: while UDP sockets do use a hash, raw IP sockets
do not. Nevertheless, raw IP sockets *should* use a hash, and I have
been reading the code to find a way to do it where raw/UDP/TCP sockets
can share the most code.
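
To make the hash-versus-list point concrete, here is a minimal sketch of a hashed demux keyed by the outer address pair. It is an illustration under assumed names (struct tunnel, tunnel_insert(), tunnel_lookup()), with an arbitrary table size and mixing function; it is not the kernel's actual socket lookup code.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64	/* power of two; size picked arbitrarily for the sketch */

/* Hypothetical per-tunnel record, keyed by the outer IPv4 address pair. */
struct tunnel {
	uint32_t src, dst;
	struct tunnel *next;	/* next record in the same bucket */
};

static struct tunnel *buckets[NBUCKETS];

/* Mix both addresses into a bucket index. */
static unsigned
tunnel_hash(uint32_t src, uint32_t dst)
{
	uint32_t h = src ^ (dst * 2654435761u);

	h ^= h >> 16;
	return h & (NBUCKETS - 1);
}

/* Link a record at the head of its bucket. */
void
tunnel_insert(struct tunnel *t)
{
	unsigned i = tunnel_hash(t->src, t->dst);

	t->next = buckets[i];
	buckets[i] = t;
}

/* Only the records in one bucket are compared, not every tunnel. */
struct tunnel *
tunnel_lookup(uint32_t src, uint32_t dst)
{
	struct tunnel *t;

	for (t = buckets[tunnel_hash(src, dst)]; t != NULL; t = t->next)
		if (t->src == src && t->dst == dst)
			return t;
	return NULL;
}
```

With a single list, every received packet is compared against each configured tunnel in turn; with buckets, the comparison is limited to the tunnels that hash to the same index, which is where the gain with hundreds or thousands of interfaces would come from.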

Dave