bpf jit

Alexander Nasonov

2012-08-19 15:06:40 UTC

I brought this up on source-changes-d@ before, but I think the subject
should be discussed on a general list as well. I'm also cc'ing the
author of sljit, Zoltán Herczeg.

Mindaugas recently committed the bpf_jit (just-in-time) code generator,
taken from FreeBSD, to the NetBSD i386 and amd64 kernels.
The new code is still broken (see below for details) and I don't know
when Mindaugas plans to fix it.

I'm finishing an alternative bpfjit library [1] which is based on the
sljit library [2]. Sljit provides a limited subset of registers and
instructions, but this subset is machine independent (with a few minor
exceptions). The subset is big enough to support all bpf instructions.

At the moment, sljit supports i386, amd64, arm, ppc32, ppc64 and mips32.
There is no sparc support because the author of sljit has neither the
time nor the hardware. I don't know the status of mips64.

I can confirm that sljit works on arm (it passes all bpfjit tests on
linux in userspace) and on amd64 (it passes all tests on -current in
userspace and it works in the kernel). The sljit kernel module is not
big; its size is around 34K on amd64.

Performance of the generated code is on par with hand-written C code on
both arm and amd64.

In my opinion, the sljit-based generator is better than hand-coded
generators for several reasons:

1) bpfjit code is machine independent while bpf_jit has to be written
for every architecture. This argument cuts both ways, of course.
If sljit doesn't support some particular architecture, it's a lot
more work to implement the whole sljit API for the new architecture
than to code up a bpf generator manually.

2) FreeBSD only supports i386 and amd64 at the moment. Some people
believe that there is support for some other arches but I don't
have any references to implementations. The best I could find is
a post by Robert Watson [3].

3) bpf_jit expects an ultimate knowledge of instruction encoding and
this badly affects both readability and extensibility. For example,
there are jumps by 12 bytes in the bpf_jit generator. In order to tell
what the destination of a jump is (and whether it's valid at all), you
need to know the binary encoding format. On the other hand, sljit
maintains jump integrity using jump and label objects.

4) FreeBSD doesn't support (quite likely for the above reason) reading
data from an mbuf chain. Instead, they fall back to the interpreter,
which leads to suboptimal performance for bigger packets and makes
their implementation look a bit like a toy. Why do they need a jit
compiler at all if they only implement it for small packets?

The current NetBSD code doesn't implement code generation for an mbuf
chain either, but it calls the jit code nevertheless. This is the bug I
mentioned at the beginning of my post.

5) bpfjit supports mbuf chains. It didn't take long to implement mbuf
chain support using sljit, but it's harder to test mbuf chains in
userspace and I haven't tested this functionality thoroughly yet.

6) sljit can be used for other things like npf_code.
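
To illustrate the point in 3) about jump and label objects, here is a
minimal sketch of the general technique. It is a toy emitter written
for this post, not sljit's API and not code from bpf_jit: all names
(struct emitter, new_label, bind_label, emit_jump, resolve_jumps) are
hypothetical. The only real encoding assumed is the x86 short jump,
opcode 0xEB followed by a signed 8-bit displacement. The generator
never writes a byte count by hand; it records where each jump sits and
which label it targets, and the displacement is computed in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CODE  64
#define MAX_JUMPS 8
#define UNBOUND   ((size_t)-1)

/* Hypothetical toy emitter; not taken from sljit or bpf_jit. */
struct emitter {
	uint8_t code[MAX_CODE];
	size_t  len;
	struct { size_t patch_at; int label; } jumps[MAX_JUMPS];
	size_t  njumps;
	size_t  labels[MAX_JUMPS];	/* label id -> byte offset */
	int     nlabels;
};

static void emit_bytes(struct emitter *e, const uint8_t *b, size_t n)
{
	assert(e->len + n <= MAX_CODE);
	for (size_t i = 0; i < n; i++)
		e->code[e->len++] = b[i];
}

/* Reserve a label id; its address is not known yet. */
static int new_label(struct emitter *e)
{
	assert(e->nlabels < MAX_JUMPS);
	e->labels[e->nlabels] = UNBOUND;
	return e->nlabels++;
}

/* Bind a label to the current end of the code buffer. */
static void bind_label(struct emitter *e, int label)
{
	e->labels[label] = e->len;
}

/* Emit a short jump with a placeholder displacement and remember it. */
static void emit_jump(struct emitter *e, int label)
{
	const uint8_t insn[2] = { 0xEB, 0x00 };

	assert(e->njumps < MAX_JUMPS);
	emit_bytes(e, insn, sizeof(insn));
	e->jumps[e->njumps].patch_at = e->len - 1;
	e->jumps[e->njumps].label = label;
	e->njumps++;
}

/*
 * Patch every recorded jump. Returns -1 if a jump targets an unbound
 * label or the displacement does not fit in 8 bits, so an invalid
 * jump is caught here instead of at run time.
 */
static int resolve_jumps(struct emitter *e)
{
	for (size_t i = 0; i < e->njumps; i++) {
		size_t target = e->labels[e->jumps[i].label];
		size_t next = e->jumps[i].patch_at + 1;
		ptrdiff_t disp;

		if (target == UNBOUND)
			return -1;
		disp = (ptrdiff_t)target - (ptrdiff_t)next;
		if (disp < -128 || disp > 127)
			return -1;
		e->code[e->jumps[i].patch_at] = (uint8_t)(int8_t)disp;
	}
	return 0;
}
```

If 12 bytes of other instructions are emitted between emit_jump() and
bind_label(), resolve_jumps() fills in a displacement of 12 by itself,
and the number stays correct if the code in between changes later.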

Sljit has downsides, of course. I already mentioned one in item 1) but
there are some others:

- No manual page. Most functionality is documented in sljitLir.h; it
should be moved or copied to a manual page.
- The API is not stable yet.
- No proper build script and no .so versioning. It's less of a problem
for kernel space but I'd like to build sljit in userspace as well
to run unit tests.

I'd like to conclude my lengthy post with a question. Which of these two
implementations should be used by NetBSD?

Thanks,
Alex

[1] https://github.com/alnsn/bpfjit
[2] http://sljit.sourceforge.net
[3] http://lists.freebsd.org/pipermail/svn-src-head/2009-November/011836.html

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Martin Husemann

2012-08-19 18:22:40 UTC