Discussion:
pfsync patch
Arnaud Degroote
2009-08-26 12:09:56 UTC
Another call for review of the pfsync integration. The work was done as part of
my GSoC project. It integrates pfsync from OpenBSD 4.2, and I have added support
to the userland tools ifconfig, netstat, tcpdump, and pcap.

I would like the patch to be reviewed. If nobody objects, I would like to
integrate it into the main tree next week.

Regards,
--
Arnaud Degroote
***@netbsd.org
Christos Zoulas
2009-08-26 21:27:51 UTC
Another call for review of the pfsync integration. The work was done as part of
my GSoC project. It integrates pfsync from OpenBSD 4.2, and I have added support
to the userland tools ifconfig, netstat, tcpdump, and pcap.
I would like the patch to be reviewed. If nobody objects, I would like to
integrate it into the main tree next week.
Regards,
--
Arnaud Degroote
+ DLT_CHOICE(DLT_PFSYNC, "Packet filter state syncinc"),
syncing?
DLT_CHOICE(DLT_PRISM_HEADER, "802.11 plus Prism header"),
DLT_CHOICE(DLT_IP_OVER_FC, "RFC 2625 IP-over-Fibre Channel"),
DLT_CHOICE(DLT_SUNATM, "Sun raw ATM"),
diff --git a/dist/pf/share/man/man4/pf.4 b/dist/pf/share/man/man4/pf.4
index 446a0d9..7051085 100644
--- a/dist/pf/share/man/man4/pf.4
+++ b/dist/pf/share/man/man4/pf.4
@@ -1131,7 +1131,7 @@ main(int argc, char *argv[])
.Xr ioctl 2 ,
.Xr bridge 4 ,
.Xr pflog 4 ,
-.\" .Xr pfsync 4 ,
+.Xr pfsync 4 ,
.Xr pfctl 8 ,
.Xr altq 9
.Sh HISTORY
diff --git a/dist/pf/share/man/man4/pfsync.4 b/dist/pf/share/man/man4/pfsync.4
new file mode 100644
index 0000000..c5adc06
--- /dev/null
+++ b/dist/pf/share/man/man4/pfsync.4
@@ -0,0 +1,245 @@
+.\" $OpenBSD: pfsync.4,v 1.25 2007/05/31 19:19:51 jmc Exp $
+.\"
+.\" Copyright (c) 2002 Michael Shalayeff
+.\" Copyright (c) 2003-2004 Ryan McBride
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF MIND,
+.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.\"
+.Dd $Mdocdate: May 31 2007 $
+.Dt PFSYNC 4
+.Os
+.Sh NAME
+.Nm pfsync
+.Nd packet filter state table logging interface
+.Sh SYNOPSIS
+.Cd "pseudo-device pfsync"
+.Sh DESCRIPTION
+The
+.Nm
+interface is a pseudo-device which exposes certain changes to the state
+table used by
+.Xr pf 4 .
+State changes can be viewed by invoking
+.Xr tcpdump 8
+on the
+.Nm
+interface.
+If configured with a physical synchronisation interface,
+.Nm
+will also send state changes out on that interface using IP multicast,
+and insert state changes received on that interface from other systems
+into the state table.
+.Pp
+By default, all local changes to the state table are exposed via
+.Nm .
+However, state changes from packets received by
+.Nm
+over the network are not rebroadcast.
+States created by a rule marked with the
+.Ar no-sync
+keyword are omitted from the
+.Nm
+interface (see
+.Xr pf.conf 5
+for details).
+.Pp
+The
+.Nm
+interface will attempt to collapse multiple updates of the same
+state into one message where possible.
+The maximum number of times this can be done before the update is sent out
+is controlled by the
+.Ar maxupd
+parameter to ifconfig
+(see
+.Xr ifconfig 8
+and the example below for more details).
+.Pp
+Each packet retrieved on this interface has a header associated
+with it of length
+.Dv PFSYNC_HDRLEN .
+The header indicates the version of the protocol, address family,
+action taken on the following states, and the number of state
+table entries attached in this packet.
+This structure is defined in
+.Aq Pa net/if_pfsync.h
+.Bd -literal -offset indent
+struct pfsync_header {
+ u_int8_t version;
+ u_int8_t af;
+ u_int8_t action;
+ u_int8_t count;
+};
+.Ed
+.Sh NETWORK SYNCHRONISATION
+States can be synchronised between two or more firewalls using this
+interface, by specifying a synchronisation interface using
+.Xr ifconfig 8 .
+For example, the following command sets fxp0 as the synchronisation
+.Bd -literal -offset indent
+# ifconfig pfsync0 syncdev fxp0
+.Ed
+.Pp
+By default, state change messages are sent out on the synchronisation
+interface using IP multicast packets.
+The protocol is IP protocol 240, PFSYNC, and the multicast group
+used is 224.0.0.240.
+When a peer address is specified using the
+.Ic syncpeer
+keyword, the peer address is used as a destination for the pfsync traffic,
+and the traffic can then be protected using
+.Xr ipsec 4 .
+In such a configuration, the syncdev should be set to the
+.Xr enc 4
+interface, as this is where the traffic arrives when it is decapsulated,
+.Bd -literal -offset indent
+# ifconfig pfsync0 syncpeer 10.0.0.2 syncdev enc0
+.Ed
+.Pp
+It is important that the pfsync traffic be well secured
+as there is no authentication on the protocol and it would
+be trivial to spoof packets which create states, bypassing the pf ruleset.
+Either run the pfsync protocol on a trusted network \- ideally a network
+dedicated to pfsync messages such as a crossover cable between two firewalls,
+or specify a peer address and protect the traffic with
+.Xr ipsec 4 .
+.Pp
+There is a one-to-one correspondence between packets seen by
+.Xr bpf 4
+on the
+.Nm
+interface, and packets sent out on the synchronisation interface, i.e.\&
+a packet with 4 state deletion messages on
+.Nm
+means that the same 4 deletions were sent out on the synchronisation
+interface.
+However, the actual packet contents may differ as the messages
+sent over the network are "compressed" where possible, containing
+only the necessary information.
+.Sh EXAMPLES
+.Nm
+and
+.Xr carp 4
+can be used together to provide automatic failover of a pair of firewalls
+configured in parallel.
+One firewall handles all traffic \- if it dies or
+is shut down, the second firewall takes over automatically.
+.Pp
+Both firewalls in this example have three
+.Xr sis 4
+interfaces.
+sis0 is the external interface, on the 10.0.0.0/24 subnet; sis1 is the
+internal interface, on the 192.168.0.0/24 subnet; and sis2 is the
+.Nm
+interface, using the 192.168.254.0/24 subnet.
+A crossover cable connects the two firewalls via their sis2 interfaces.
+On all three interfaces, firewall A uses the .254 address, while firewall B
+uses .253.
+The interfaces are configured as follows (firewall A unless otherwise
+.Pp
+.Bd -literal -offset indent
+inet 10.0.0.254 255.255.255.0 NONE
+.Ed
+.Pp
+.Bd -literal -offset indent
+inet 192.168.0.254 255.255.255.0 NONE
+.Ed
+.Pp
+.Bd -literal -offset indent
+inet 192.168.254.254 255.255.255.0 NONE
+.Ed
+.Pp
+.Bd -literal -offset indent
+inet 10.0.0.1 255.255.255.0 10.0.0.255 vhid 1 pass foo
+.Ed
+.Pp
+.Bd -literal -offset indent
+inet 192.168.0.1 255.255.255.0 192.168.0.255 vhid 2 pass bar
+.Ed
+.Pp
+.Bd -literal -offset indent
+up syncdev sis2
+.Ed
+.Pp
+.Xr pf 4
+must also be configured to allow
+.Nm
+and
+.Xr carp 4
+traffic through.
+The following should be added to the top of
+.Bd -literal -offset indent
+pass quick on { sis2 } proto pfsync
+pass on { sis0 sis1 } proto carp
+.Ed
+.Pp
+If it is preferable that one firewall handle the traffic,
+the
+.Ar advskew
+on the backup firewall's
+.Xr carp 4
+interfaces should be set to something higher than
+the primary's.
+For example, if firewall B is the backup, its
+.Pa /etc/hostname.carp1
+.Bd -literal -offset indent
+inet 192.168.0.1 255.255.255.0 192.168.0.255 vhid 2 pass bar \e
+ advskew 100
+.Ed
+.Pp
+The following must also be added to
+.Bd -literal -offset indent
+net.inet.carp.preempt=1
+.Ed
+.Sh SEE ALSO
+.Xr bpf 4 ,
+.Xr carp 4 ,
+.Xr enc 4 ,
+.Xr inet 4 ,
+.Xr inet6 4 ,
+.Xr ipsec 4 ,
+.Xr netintro 4 ,
+.Xr pf 4 ,
+.Xr hostname.if 5 ,
+.Xr pf.conf 5 ,
+.Xr protocols 5 ,
+.Xr ifconfig 8 ,
+.Xr ifstated 8 ,
+.Xr tcpdump 8
+.Sh HISTORY
+The
+.Nm
+device first appeared in
+.Ox 3.3 .
diff --git a/dist/tcpdump/interface.h b/dist/tcpdump/interface.h
index 3e5b5d9..8415d48 100644
--- a/dist/tcpdump/interface.h
+++ b/dist/tcpdump/interface.h
@@ -199,6 +199,8 @@ extern void dvmrp_print(const u_char *, u_int);
extern void egp_print(const u_char *, u_int);
extern u_int enc_if_print(const struct pcap_pkthdr *, const u_char *);
extern u_int pflog_if_print(const struct pcap_pkthdr *, const u_char *);
+extern u_int pfsync_if_print(const struct pcap_pkthdr *, const u_char *);
+extern void pfsync_ip_print(const u_char*, u_int, const u_char *);
extern u_int arcnet_if_print(const struct pcap_pkthdr *, const u_char *);
extern u_int arcnet_linux_if_print(const struct pcap_pkthdr *, const u_char *);
extern void ether_print(const u_char *, u_int, u_int);
diff --git a/dist/tcpdump/ipproto.c b/dist/tcpdump/ipproto.c
index b156f53..2acbfe8 100755
--- a/dist/tcpdump/ipproto.c
+++ b/dist/tcpdump/ipproto.c
@@ -62,6 +62,7 @@ struct tok ipproto_values[] = {
{ IPPROTO_PGM, "PGM" },
{ IPPROTO_SCTP, "SCTP" },
{ IPPROTO_MOBILITY, "Mobility" },
+ { IPPROTO_PFSYNC, "PFSYNC" },
space vs tab?
{ 0, NULL }
};
+ (unsigned long long int)pf_state_counter_from_pfsync(s->packets[0]),
+ (unsigned long long int)pf_state_counter_from_pfsync(s->packets[1]),
+ (unsigned long long int)pf_state_counter_from_pfsync(s->bytes[0]),
+ (unsigned long long int)pf_state_counter_from_pfsync(s->bytes[1]));
we stopped using long int a long time ago. s/ int//
+ if (s->anchor != -1)
+ printf(", anchor %u", s->anchor);
+ for (i = 1, u = (void *)((char *)hdr + PFSYNC_HDRLEN);
+ i <= hdr->count && i * sizeof(*u) <= len; i++, u++) {
+ bcopy(&u->id, &id, sizeof(id));
no bcopy in new code.
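For illustration, a minimal sketch of the replacement, assuming a simplified record layout: struct upd_rec and rec_id() are made-up names, not the real pfsync structures. Note that memcpy(3) takes the destination first, the reverse of bcopy(3).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an on-wire record; not the real pfsync layout. */
struct upd_rec {
	uint8_t		id[8];		/* may be unaligned in a real packet */
	uint32_t	creatorid;
};

/*
 * Copy the id out of the record into an aligned local.
 * bcopy(src, dst, len) becomes memcpy(dst, src, len).
 */
static uint64_t
rec_id(const struct upd_rec *u)
{
	uint64_t id;

	assert(u != NULL);
	memcpy(&id, u->id, sizeof(id));
	return id;
}
```

The same swap of the first two arguments applies to every bcopy() flagged further down.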
+ printf("\n\tid: %016llx creatorid: %08x",
+ be64toh(id), ntohl(u->creatorid));
these be64toh with %llx will produce warnings in _LP64.
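One way to avoid the warning, sketched under assumptions: on LP64 a 64-bit type is typically unsigned long, so %llx does not match, whereas the PRIx64 macro from <inttypes.h> always matches uint64_t. The helpers be64_decode() and format_id() below are hypothetical names used only for this example; be64_decode() stands in for be64toh() so the sketch does not depend on a particular endian header.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Decode 8 big-endian bytes; a portable stand-in for be64toh(). */
static uint64_t
be64_decode(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Format an id/creatorid pair.  PRIx64 and PRIx32 expand to the correct
 * length modifier for uint64_t and uint32_t on both ILP32 and LP64.
 */
static int
format_id(char *buf, size_t len, const uint8_t *wire_id, uint32_t creatorid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "id: %016" PRIx64 " creatorid: %08" PRIx32,
	    be64_decode(wire_id), creatorid);
}
```

An alternative that keeps %llx is to cast the argument explicitly to unsigned long long.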
+ if (vflag > 1)
+ printf(" updates: %d", u->updates);
+ }
+ break;
+ for (i = 1, d = (void *)((char *)hdr + PFSYNC_HDRLEN);
+ i <= hdr->count && i * sizeof(*d) <= len; i++, d++) {
+ bcopy(&d->id, &id, sizeof(id));
again
+ printf("\n\tid: %016llx creatorid: %08x",
+ be64toh(id), ntohl(d->creatorid));
again
+ }
+ break;
+ for (i = 1, r = (void *)((char *)hdr + PFSYNC_HDRLEN);
+ i <= hdr->count && i * sizeof(*r) <= len; i++, r++) {
+ bcopy(&r->id, &id, sizeof(id));
+ printf("\n\tid: %016llx creatorid: %08x",
+ be64toh(id), ntohl(r->creatorid));
again
+ }
+ break;
+ if (sizeof(*b) <= len) {
+ b = (void *)((char *)hdr + PFSYNC_HDRLEN);
+ printf("\n\tcreatorid: %08x", htonl(b->creatorid));
+ sec = b->endtime % 60;
+ b->endtime /= 60;
+ min = b->endtime % 60;
+ b->endtime /= 60;
+ printf(" age %.2u:%.2u:%.2u", b->endtime, min, sec);
+ switch (b->status) {
+ printf(" status: start");
+ break;
+ printf(" status: end");
+ break;
+ printf(" status: ?");
+ break;
+ }
+ }
+ break;
+ for (i = 1, t = (void *)((char *)hdr + PFSYNC_HDRLEN);
+ i <= hdr->count && i * sizeof(*t) <= len; i++, t++)
+ printf("\n\tspi: %08x rpl: %u cur_bytes: %llu",
+ htonl(t->spi), htonl(t->rpl),
+ be64toh(t->cur_bytes));
again
+ /* XXX add dst and sproto? */
+ break;
+ break;
+ }
+}
diff --git a/dist/tcpdump/tcpdump.c b/dist/tcpdump/tcpdump.c
index 3915b30..6e3955d 100644
--- a/dist/tcpdump/tcpdump.c
+++ b/dist/tcpdump/tcpdump.c
@@ -200,6 +200,9 @@ static struct printer printers[] = {
#ifdef DLT_PFLOG
{ pflog_if_print, DLT_PFLOG },
#endif
+#ifdef DLT_PFSYNC
+ { pfsync_if_print, DLT_PFSYNC },
+#endif
#ifdef DLT_FR
{ fr_if_print, DLT_FR },
#endif
diff --git a/distrib/sets/lists/man/mi b/distrib/sets/lists/man/mi
index afe5fe3..9d5054c 100644
--- a/distrib/sets/lists/man/mi
+++ b/distrib/sets/lists/man/mi
@@ -1281,6 +1281,7 @@
./usr/share/man/cat4/pdcsata.0 man-sys-catman .cat
./usr/share/man/cat4/pf.0 man-pf-catman pf,.cat
./usr/share/man/cat4/pflog.0 man-pf-catman pf,.cat
+./usr/share/man/cat4/pfsync.0 man-pf-catman pf,.cat
./usr/share/man/cat4/phy.0 man-sys-catman .cat
./usr/share/man/cat4/piixide.0 man-sys-catman .cat
./usr/share/man/cat4/piixpcib.0 man-sys-catman .cat
@@ -3842,6 +3843,7 @@
./usr/share/man/html4/pdcsata.html man-sys-htmlman html
./usr/share/man/html4/pf.html man-pf-htmlman pf,html
./usr/share/man/html4/pflog.html man-pf-htmlman pf,html
+./usr/share/man/html4/pfsync.html man-pf-htmlman pf,html
./usr/share/man/html4/phy.html man-sys-htmlman html
./usr/share/man/html4/piixide.html man-sys-htmlman html
./usr/share/man/html4/piixpcib.html man-sys-htmlman html
@@ -6281,6 +6283,7 @@
./usr/share/man/man4/pdcsata.4 man-sys-man .man
./usr/share/man/man4/pf.4 man-pf-man pf,.man
./usr/share/man/man4/pflog.4 man-pf-man pf,.man
+./usr/share/man/man4/pfsync.4 man-pf-man pf,.man
./usr/share/man/man4/phy.4 man-sys-man .man
./usr/share/man/man4/piixide.4 man-sys-man .man
./usr/share/man/man4/piixpcib.4 man-sys-man .man
diff --git a/etc/protocols b/etc/protocols
index 5ccff7e..19f7e0f 100644
--- a/etc/protocols
+++ b/etc/protocols
@@ -157,6 +157,7 @@ mobility 135 Mobility # Header [RFC3775]
udplite 136 UDPLite # [RFC3828]
mpls-in-ip 137 MPLS-in-IP # [RFC4023]
# 138-252 Unassigned [IANA]
+pfsync 240 PFSYNC # PF Synchronization
use 253 Use # for experimentation and testing [RFC3692]
use 254 Use # for experimentation and testing [RFC3692]
# 255 Reserved [IANA]
diff --git a/sbin/ifconfig/Makefile.inc b/sbin/ifconfig/Makefile.inc
index aed9382..d24887a 100644
--- a/sbin/ifconfig/Makefile.inc
+++ b/sbin/ifconfig/Makefile.inc
@@ -19,3 +19,6 @@ SRCS+= parse.c
SRCS+= tunnel.c
SRCS+= util.c
SRCS+= vlan.c
+
+CPPFLAGS+=-I ${.CURDIR}/../../sys/dist/pf/
+SRCS+= pfsync.c
diff --git a/sbin/ifconfig/ifconfig.8 b/sbin/ifconfig/ifconfig.8
index 6bed57b..7e5f3cb 100644
--- a/sbin/ifconfig/ifconfig.8
+++ b/sbin/ifconfig/ifconfig.8
@@ -1,4 +1,4 @@
-.\" $NetBSD: ifconfig.8,v 1.100 2009/08/07 20:13:12 dyoung Exp $
+.\" $NetBSD: ifconfig.8,v 1.98 2009/07/02 18:43:47 dyoung Exp $
.\"
.\" Copyright (c) 1983, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
@@ -723,6 +723,37 @@ support it.
.It Cm -tso6
Disable hardware-assisted TCP/IPv6 segmentation on interfaces that
support it.
+.It Cm maxupd Ar n
+If the driver is a
+.Xr pfsync 4
+pseudo-device, indicate the maximum number
+of updates for a single state which can be collapsed into one.
+This is an 8-bit number; the default value is 128.
+.It Cm syncdev Ar iface
+If the driver is a
+.Xr pfsync 4
+pseudo-device, use the specified interface
+to send and receive pfsync state synchronisation messages.
+.It Fl syncdev
+If the driver is a
+.Xr pfsync 4
+pseudo-device, stop sending pfsync state
+synchronisation messages over the network.
+.It Cm syncpeer Ar peer_address
+If the driver is a
+.Xr pfsync 4
+pseudo-device, make the pfsync link point-to-point rather than using
+multicast to broadcast the state synchronisation messages.
+The peer_address is the IP address of the other host taking part in
+the pfsync cluster.
+With this option,
+.Xr pfsync 4
+traffic can be protected using
+.Xr ipsec 4 .
+.It Fl syncpeer
+If the driver is a
+.Xr pfsync 4
+pseudo-device, broadcast the packets using multicast.
.El
.Pp
.Nm
.Pp
.Ic ifconfig sip0 link 00:11:22:33:44:55
.Pp
.Pp
.Ic ifconfig sip0 link 00:11:22:33:44:55 active
.Sh DIAGNOSTICS
@@ -848,6 +879,7 @@ tried to alter an interface's configuration.
.Xr carp 4 ,
.Xr ifmedia 4 ,
.Xr netintro 4 ,
+.Xr pfsync 4 ,
.Xr vlan 4 ,
.Xr ifconfig.if 5 ,
.\" .Xr eon 5 ,
diff --git a/sbin/ifconfig/pfsync.c b/sbin/ifconfig/pfsync.c
new file mode 100644
index 0000000..8f534c8
--- /dev/null
+++ b/sbin/ifconfig/pfsync.c
@@ -0,0 +1,202 @@
no copyright?
+#include <sys/cdefs.h>
+#ifndef lint
+__RCSID("$NetBSD:$");
+#endif /* not lint */
+
+
+static status_func_t status;
+static usage_func_t usage;
+static cmdloop_branch_t branch;
+
+static void pfsync_constructor(void) __attribute__((constructor));
should we depend on __constructor__? Or add explicit call
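To make the two options concrete, here is a rough sketch. Every name in it (pfsync_register, module_inits, run_module_inits, USE_CONSTRUCTOR) is hypothetical and does not reflect how ifconfig actually registers its modules. Option A relies on the GCC/Clang constructor attribute; option B is an explicit table that main() would walk once at startup.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

/* Counts registrations so the effect of either option is observable. */
static int pfsync_registered;

/* Hypothetical registration hook for the pfsync keywords. */
static void
pfsync_register(void)
{
	pfsync_registered++;
}

#ifdef USE_CONSTRUCTOR
/* Option A: compiler extension, runs before main() without any call site. */
static void pfsync_ctor(void) __attribute__((constructor));

static void
pfsync_ctor(void)
{
	pfsync_register();
}
#endif

/* Option B: explicit table, portable to any C compiler. */
static const init_fn module_inits[] = {
	pfsync_register,
};

static void
run_module_inits(void)
{
	size_t i;

	for (i = 0; i < sizeof(module_inits) / sizeof(module_inits[0]); i++) {
		assert(module_inits[i] != NULL);
		module_inits[i]();
	}
}
```

Option B costs one line per module in a central table but makes the initialisation order explicit; option A keeps each module self-contained at the price of depending on the toolchain.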
+static void pfsync_status(prop_dictionary_t, prop_dictionary_t);
+static int setpfsync_maxupd(prop_dictionary_t, prop_dictionary_t);
+static int setpfsync_peer(prop_dictionary_t, prop_dictionary_t);
+static int setpfsyncdev(prop_dictionary_t, prop_dictionary_t);
+
+struct pinteger parse_maxupd = PINTEGER_INITIALIZER1(&parse_maxupd, "maxupd",
+ 0, 255, 10, setpfsync_maxupd, "maxupd", &command_root.pb_parser);
+
+struct piface pfsyncdev = PIFACE_INITIALIZER(&pfsyncdev, "syncdev", setpfsyncdev,
+ "syncdev", &command_root.pb_parser);
+
+struct paddr parse_sync_peer = PADDR_INITIALIZER(&parse_sync_peer, "syncpeer",
+ setpfsync_peer, "syncpeer", NULL, NULL, NULL, &command_root.pb_parser);
+
+static const struct kwinst pfsynckw[] = {
+ {.k_word = "maxupd", .k_nextparser = &parse_maxupd.pi_parser}
+ , {.k_word = "syncdev", .k_nextparser = &pfsyncdev.pif_parser}
+ , {.k_word = "-syncdev", .k_key = "syncdev", .k_type = KW_T_STR,
+ .k_str = "", .k_exec = setpfsyncdev,
+ .k_nextparser = &command_root.pb_parser}
+ , {.k_word = "syncpeer", .k_nextparser = &parse_sync_peer.pa_parser}
+ , {.k_word = "-syncpeer", .k_key = "syncpeer", .k_type = KW_T_STR,
+ .k_str = "", .k_exec = setpfsync_peer,
+ .k_nextparser = &command_root.pb_parser}
+};
indent looks ugly
+
+struct pkw pfsync = PKW_INITIALIZER(&pfsync, "pfsync", NULL, NULL,
+
+ s = (const struct sockaddr_in*) &peerpfx->pfx_addr;
not KNF.
+
+ memcpy(&pfsyncr.pfsyncr_syncpeer.s_addr, &s->sin_addr,
+ MIN(sizeof(pfsyncr.pfsyncr_syncpeer.s_addr),
+ peerpfx->pfx_addr.sa_len));
not KNF
+ } else {
+ memset(&pfsyncr.pfsyncr_syncpeer.s_addr, 0,
+ sizeof(pfsyncr.pfsyncr_syncpeer.s_addr));
not KNF
+ }
+
+ pfsync_set(env, &pfsyncr);
+
diff --git a/sys/dist/pf/net/if_pfsync.c b/sys/dist/pf/net/if_pfsync.c
new file mode 100644
index 0000000..8a53013
--- /dev/null
+++ b/sys/dist/pf/net/if_pfsync.c
@@ -0,0 +1,1828 @@
+/* $NetBSD: if_pfsync.c Exp $ */
+/* $OpenBSD: if_pfsync.c,v 1.83 2007/06/26 14:44:12 mcbride Exp $ */
+
+/*
+ * Copyright (c) 2002 Michael Shalayeff
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR OR HIS RELATIVES BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF MIND, USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__KERNEL_RCSID(0, "$NetBSD: if_pfsync.c Exp $");
+
+#ifdef _KERNEL_OPT
+#include "opt_inet.h"
+#include "opt_inet6.h"
+#endif
+
+#include <sys/param.h>
+#include <sys/proc.h>
+#include <sys/systm.h>
+#include <sys/time.h>
+#include <sys/mbuf.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+#include <sys/callout.h>
+#include <sys/kernel.h>
+
+#include <net/if.h>
+#include <net/if_types.h>
+#include <net/route.h>
+#include <net/bpf.h>
+#include <netinet/in.h>
+#ifndef __NetBSD__
+#include <netinet/if_ether.h>
+#else
+#include <net/if_ether.h>
+#endif /* __NetBSD__ */
+#include <netinet/tcp.h>
+#include <netinet/tcp_seq.h>
+
+#ifdef INET
+#include <netinet/in_systm.h>
+#include <netinet/in_var.h>
+#include <netinet/ip.h>
+#include <netinet/ip_var.h>
+#endif
+
+#ifdef INET6
+#include <netinet6/nd6.h>
+#endif /* INET6 */
+
+#include "carp.h"
+#if NCARP > 0
+extern int carp_suppress_preempt;
+#endif
+
+#include <net/pfvar.h>
+#include <net/if_pfsync.h>
+
+#ifdef __NetBSD__
+#include <sys/conf.h>
+#include <sys/lwp.h>
+#include <sys/kauth.h>
+#include <sys/sysctl.h>
+
+#include <net/net_stats.h>
+
+percpu_t *pfsyncstat_percpu;
+
+#define PFSYNC_STATINC(x) _NET_STATINC(pfsyncstat_percpu, x)
+#endif /* __NetBSD__ */
+
+#include "bpfilter.h"
+#include "pfsync.h"
+
+#define PFSYNC_MINMTU \
+ (sizeof(struct pfsync_header) + sizeof(struct pf_state))
+
+#ifdef PFSYNCDEBUG
+#define DPRINTF(x) do { if (pfsyncdebug) printf x ; } while (0)
+int pfsyncdebug;
+#else
+#define DPRINTF(x)
+#endif
+
+extern int ifqmaxlen; /* XXX */
+
+struct pfsync_softc *pfsyncif = NULL;
+
+void pfsyncattach(int);
+int pfsync_clone_create(struct if_clone *, int);
+int pfsync_clone_destroy(struct ifnet *);
+void pfsync_setmtu(struct pfsync_softc *, int);
+int pfsync_alloc_scrub_memory(struct pfsync_state_peer *,
+ struct pf_state_peer *);
+int pfsync_insert_net_state(struct pfsync_state *, u_int8_t);
+void pfsync_update_net_tdb(struct pfsync_tdb *);
+int pfsyncoutput(struct ifnet *, struct mbuf *, const struct sockaddr *,
+ struct rtentry *);
+int pfsyncioctl(struct ifnet *, u_long, void*);
+void pfsyncstart(struct ifnet *);
+
+struct mbuf *pfsync_get_mbuf(struct pfsync_softc *, u_int8_t, void **);
+int pfsync_request_update(struct pfsync_state_upd *, struct in_addr *);
+int pfsync_sendout(struct pfsync_softc *);
+int pfsync_tdb_sendout(struct pfsync_softc *);
+int pfsync_sendout_mbuf(struct pfsync_softc *, struct mbuf *);
+void pfsync_timeout(void *);
+void pfsync_tdb_timeout(void *);
+void pfsync_send_bus(struct pfsync_softc *, u_int8_t);
+void pfsync_bulk_update(void *);
+void pfsync_bulkfail(void *);
+
+int pfsync_sync_ok;
+
+struct if_clone pfsync_cloner =
+ IF_CLONE_INITIALIZER("pfsync", pfsync_clone_create, pfsync_clone_destroy);
+
+void
+pfsyncattach(int npfsync)
+{
+ if_clone_attach(&pfsync_cloner);
+
+ pfsyncstat_percpu = percpu_alloc(sizeof(uint64_t) * PFSYNC_NSTATS);
+}
+
+int
+pfsync_clone_create(struct if_clone *ifc, int unit)
+{
+ struct ifnet *ifp;
+
+ if (unit != 0)
+ return (EINVAL);
+
+ pfsync_sync_ok = 1;
+ if ((pfsyncif = malloc(sizeof(*pfsyncif), M_DEVBUF, M_NOWAIT)) == NULL)
+ return (ENOMEM);
+ bzero(pfsyncif, sizeof(*pfsyncif));
no bzero in new code
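A small userland sketch of the memset(3) form, assuming a cut-down structure: struct softc_sketch and softc_alloc() are illustrative names only, not the real pfsync_softc. In the kernel, I believe malloc(9) can also be passed M_ZERO to get zeroed memory directly, which would make the separate clearing step unnecessary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct pfsync_softc. */
struct softc_sketch {
	void	*sc_mbuf;
	int	 sc_maxupdates;
};

/* Allocate and zero the structure, then set the non-zero defaults. */
static struct softc_sketch *
softc_alloc(void)
{
	struct softc_sketch *sc;

	if ((sc = malloc(sizeof(*sc))) == NULL)
		return NULL;
	memset(sc, 0, sizeof(*sc));	/* was: bzero(sc, sizeof(*sc)) */
	sc->sc_maxupdates = 128;
	assert(sc->sc_mbuf == NULL);
	return sc;
}
```

Once the structure is zeroed, the explicit NULL and 0 assignments that follow in the patch become redundant, so only the non-zero defaults need to be set.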
+ pfsyncif->sc_mbuf = NULL;
+ pfsyncif->sc_mbuf_net = NULL;
+ pfsyncif->sc_mbuf_tdb = NULL;
+ pfsyncif->sc_statep.s = NULL;
+ pfsyncif->sc_statep_net.s = NULL;
+ pfsyncif->sc_statep_tdb.t = NULL;
+ pfsyncif->sc_maxupdates = 128;
+ pfsyncif->sc_sync_peer.s_addr = INADDR_PFSYNC_GROUP;
+ pfsyncif->sc_sendaddr.s_addr = INADDR_PFSYNC_GROUP;
+ pfsyncif->sc_ureq_received = 0;
+ pfsyncif->sc_ureq_sent = 0;
+ pfsyncif->sc_bulk_send_next = NULL;
+ pfsyncif->sc_bulk_terminator = NULL;
+ ifp = &pfsyncif->sc_if;
+ snprintf(ifp->if_xname, sizeof ifp->if_xname, "pfsync%d", unit);
+ ifp->if_softc = pfsyncif;
+ ifp->if_ioctl = pfsyncioctl;
+ ifp->if_output = pfsyncoutput;
+ ifp->if_start = pfsyncstart;
+ ifp->if_type = IFT_PFSYNC;
+ ifp->if_snd.ifq_maxlen = ifqmaxlen;
+ ifp->if_hdrlen = PFSYNC_HDRLEN;
+ pfsync_setmtu(pfsyncif, ETHERMTU);
+
+ callout_init(&pfsyncif->sc_tmo, 0);
+ callout_init(&pfsyncif->sc_tdb_tmo, 0);
+ callout_init(&pfsyncif->sc_bulk_tmo, 0);
+ callout_init(&pfsyncif->sc_bulkfail_tmo, 0);
+ callout_setfunc(&pfsyncif->sc_tmo, pfsync_timeout, pfsyncif);
+ callout_setfunc(&pfsyncif->sc_tdb_tmo, pfsync_tdb_timeout, pfsyncif);
+ callout_setfunc(&pfsyncif->sc_bulk_tmo, pfsync_bulk_update, pfsyncif);
+ callout_setfunc(&pfsyncif->sc_bulkfail_tmo, pfsync_bulkfail, pfsyncif);
+
+ if_attach(ifp);
+ if_alloc_sadl(ifp);
+
+#if NBPFILTER > 0
+ bpfattach(&pfsyncif->sc_if, DLT_PFSYNC, PFSYNC_HDRLEN);
+#endif
+
+ return (0);
+}
+
+int
+pfsync_clone_destroy(struct ifnet *ifp)
+{
+#if NBPFILTER > 0
+ bpfdetach(ifp);
+#endif
+ if_detach(ifp);
+ free(pfsyncif, M_DEVBUF);
+ pfsyncif = NULL;
+ return (0);
+}
+
+/*
+ * Start output on the pfsync interface.
+ */
+void
+pfsyncstart(struct ifnet *ifp)
+{
+ struct mbuf *m;
+ int s;
+
+ for (;;) {
+ s = splnet();
+ IF_DROP(&ifp->if_snd);
+ IF_DEQUEUE(&ifp->if_snd, m);
+ splx(s);
+
+ if (m == NULL)
+ return;
+ else
+ m_freem(m);
+ }
+}
+
+int
+pfsync_alloc_scrub_memory(struct pfsync_state_peer *s,
+ struct pf_state_peer *d)
+{
+ if (s->scrub.scrub_flag && d->scrub == NULL) {
+ d->scrub = pool_get(&pf_state_scrub_pl, PR_NOWAIT);
+ if (d->scrub == NULL)
+ return (ENOMEM);
+ bzero(d->scrub, sizeof(*d->scrub));
no bzero in new code.
+ }
+
+ return (0);
+}
+
+int
+pfsync_insert_net_state(struct pfsync_state *sp, u_int8_t chksum_flag)
+{
+ struct pf_state *st = NULL;
+ struct pf_state_key *sk = NULL;
+ struct pf_rule *r = NULL;
+ struct pfi_kif *kif;
+
+ if (sp->creatorid == 0 && pf_status.debug >= PF_DEBUG_MISC) {
+ printf("pfsync_insert_net_state: invalid creator id:"
+ " %08x\n", ntohl(sp->creatorid));
+ return (EINVAL);
+ }
+
+ kif = pfi_kif_get(sp->ifname);
+ if (kif == NULL) {
+ if (pf_status.debug >= PF_DEBUG_MISC)
+ printf("pfsync_insert_net_state: "
+ "unknown interface: %s\n", sp->ifname);
+ /* skip this state */
+ return (0);
+ }
+
+ /*
+ * If the ruleset checksums match, it's safe to associate the state
+ * with the rule of that number.
+ */
+ if (sp->rule != htonl(-1) && sp->anchor == htonl(-1) && chksum_flag &&
+ ntohl(sp->rule) <
+ pf_main_ruleset.rules[PF_RULESET_FILTER].active.rcount)
+ r = pf_main_ruleset.rules[
+ PF_RULESET_FILTER].active.ptr_array[ntohl(sp->rule)];
+ else
+ r = &pf_default_rule;
+
+ if (!r->max_states || r->states < r->max_states)
+ st = pool_get(&pf_state_pl, PR_NOWAIT);
+ if (st == NULL) {
+ pfi_kif_unref(kif, PFI_KIF_REF_NONE);
+ return (ENOMEM);
+ }
+ bzero(st, sizeof(*st));
no bzero in new code
+
+ if ((sk = pf_alloc_state_key(st)) == NULL) {
+ pool_put(&pf_state_pl, st);
+ return (error);
+ bzero(&pfsyncr, sizeof(pfsyncr));
no bzero in new code.
+ if (sc->sc_sync_ifp)
+ i = 255;
+ sp = sc->sc_statep.s++;
+ sc->sc_mbuf->m_pkthdr.len =
+ sc->sc_mbuf->m_len += sizeof(struct pfsync_state);
+ h->count++;
+ bzero(sp, sizeof(*sp));
+
+ bcopy(&st->id, sp->id, sizeof(sp->id));
no bzero or bcopy in new code.
+ sp->creatorid = st->creatorid;
+
+ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname));
+ pf_state_host_hton(&sk->lan, &sp->lan);
+ pf_state_host_hton(&sk->gwy, &sp->gwy);
+ pf_state_host_hton(&sk->ext, &sp->ext);
+
again
+ bcopy(&st->rt_addr, &sp->rt_addr, sizeof(sp->rt_addr));
+
+ sp->creation = htonl(secs - st->creation);
+ pf_state_counter_hton(st->packets[0], sp->packets[0]);
+ pf_state_counter_hton(st->packets[1], sp->packets[1]);
+ pf_state_counter_hton(st->bytes[0], sp->bytes[0]);
+ pf_state_counter_hton(st->bytes[1], sp->bytes[1]);
+ if ((r = st->rule.ptr) == NULL)
+ sp->rule = htonl(-1);
+ else
+ sp->rule = htonl(r->nr);
+ if ((r = st->anchor.ptr) == NULL)
+ sp->anchor = htonl(-1);
+ else
+ sp->anchor = htonl(r->nr);
+ sp->af = sk->af;
+ sp->proto = sk->proto;
+ sp->direction = sk->direction;
+ sp->log = st->log;
+ sp->allow_opts = st->allow_opts;
+ sp->timeout = st->timeout;
+
+ if (flags & PFSYNC_FLAG_STALE)
+ sp->sync_flags |= PFSTATE_STALE;
+ }
+
+ sc->sc_mbuf_net->m_len += sizeof(*up);
+ up = sc->sc_statep_net.u++;
+
+ bzero(up, sizeof(*up));
+ bcopy(&st->id, up->id, sizeof(up->id));
again
+ up->creatorid = st->creatorid;
+ }
+ up->timeout = st->timeout;
+ up->expire = sp->expire;
+ up->src = sp->src;
+ up->dst = sp->dst;
+ break;
+ sc->sc_mbuf_net->m_pkthdr.len =
+ sc->sc_mbuf_net->m_len += sizeof(*dp);
+ dp = sc->sc_statep_net.d++;
+ h_net->count++;
+
+ bzero(dp, sizeof(*dp));
+ bcopy(&st->id, dp->id, sizeof(dp->id));
again
+ dp->creatorid = st->creatorid;
+ break;
+ }
+ }
+
+ if (h->count == sc->sc_maxcount ||
+ (sc->sc_maxupdates && (sp->updates >= sc->sc_maxupdates)))
+ ret = pfsync_sendout(sc);
+
+ splx(s);
+ return (ret);
+}
+
+/* This must be called in splnet() */
+int
+pfsync_request_update(struct pfsync_state_upd *up, struct in_addr *src)
+{
+ struct ifnet *ifp = NULL;
+ struct pfsync_header *h;
+ struct pfsync_softc *sc = pfsyncif;
+ struct pfsync_state_upd_req *rup;
+ int ret = 0;
+
+ if (sc == NULL)
+ return (0);
+
+ ifp = &sc->sc_if;
+ if (sc->sc_mbuf == NULL) {
+ if ((sc->sc_mbuf = pfsync_get_mbuf(sc, PFSYNC_ACT_UREQ,
+ (void *)&sc->sc_statep.s)) == NULL)
+ return (ENOMEM);
+ h = mtod(sc->sc_mbuf, struct pfsync_header *);
+ } else {
+ h = mtod(sc->sc_mbuf, struct pfsync_header *);
+ if (h->action != PFSYNC_ACT_UREQ) {
+ pfsync_sendout(sc);
+ if ((sc->sc_mbuf = pfsync_get_mbuf(sc, PFSYNC_ACT_UREQ,
+ (void *)&sc->sc_statep.s)) == NULL)
+ return (ENOMEM);
+ h = mtod(sc->sc_mbuf, struct pfsync_header *);
+ }
+ }
+
+ if (src != NULL)
+ sc->sc_sendaddr = *src;
+ sc->sc_mbuf->m_pkthdr.len = sc->sc_mbuf->m_len += sizeof(*rup);
+ h->count++;
+ rup = sc->sc_statep.r++;
+ bzero(rup, sizeof(*rup));
again
+ if (up != NULL) {
+ bcopy(up->id, rup->id, sizeof(rup->id));
+ rup->creatorid = up->creatorid;
+ }
+
+ if (h->count == sc->sc_maxcount)
+ ret = pfsync_sendout(sc);
+
+ return (ret);
+}
+
+int
+pfsync_clear_states(u_int32_t creatorid, char *ifname)
+{
+ struct ifnet *ifp = NULL;
+ struct pfsync_softc *sc = pfsyncif;
+ struct pfsync_state_clr *cp;
+ int s, ret;
+
+ if (sc == NULL)
+ return (0);
+
+ ifp = &sc->sc_if;
+ s = splnet();
+ if (sc->sc_mbuf != NULL)
+ pfsync_sendout(sc);
+ if ((sc->sc_mbuf = pfsync_get_mbuf(sc, PFSYNC_ACT_CLR,
+ (void *)&sc->sc_statep.c)) == NULL) {
+ splx(s);
+ return (ENOMEM);
+ }
+ sc->sc_mbuf->m_pkthdr.len = sc->sc_mbuf->m_len += sizeof(*cp);
+ cp = sc->sc_statep.c;
+ cp->creatorid = creatorid;
+ if (ifname != NULL)
+ strlcpy(cp->ifname, ifname, IFNAMSIZ);
+
+ ret = (pfsync_sendout(sc));
+ splx(s);
+ return (ret);
+}
+
+void
+pfsync_timeout(void *v)
+{
+ struct pfsync_softc *sc = v;
+ int s;
+
+ s = splnet();
+ pfsync_sendout(sc);
+ splx(s);
+}
+
+void
+pfsync_tdb_timeout(void *v)
+{
+ struct pfsync_softc *sc = v;
+ int s;
+
+ s = splnet();
+ pfsync_tdb_sendout(sc);
+ splx(s);
+}
+
+/* This must be called in splnet() */
+void
+pfsync_send_bus(struct pfsync_softc *sc, u_int8_t status)
+{
+ struct pfsync_state_bus *bus;
+
+ if (sc->sc_mbuf != NULL)
+ pfsync_sendout(sc);
+
+ if (pfsync_sync_ok &&
+ (sc->sc_mbuf = pfsync_get_mbuf(sc, PFSYNC_ACT_BUS,
+ (void *)&sc->sc_statep.b)) != NULL) {
+ sc->sc_mbuf->m_pkthdr.len = sc->sc_mbuf->m_len += sizeof(*bus);
+ bus = sc->sc_statep.b;
+ bus->creatorid = pf_status.hostid;
+ bus->status = status;
+ bus->endtime = htonl(time_uptime - sc->sc_ureq_received);
+ pfsync_sendout(sc);
+ }
+}
+
+void
+pfsync_bulk_update(void *v)
+{
+ struct pfsync_softc *sc = v;
+ int s, i = 0;
+ struct pf_state *state;
+
+ s = splnet();
+ if (sc->sc_mbuf != NULL)
+ pfsync_sendout(sc);
+
+ /*
+ * Grab at most PFSYNC_BULKPACKETS worth of states which have not
+ * been sent since the latest request was made.
+ */
+ state = sc->sc_bulk_send_next;
+ if (state)
+ do {
+ /* send state update if syncable and not already sent */
+ if (!state->sync_flags
+ && state->timeout < PFTM_MAX
+ && state->pfsync_time <= sc->sc_ureq_received) {
+ pfsync_pack_state(PFSYNC_ACT_UPD, state, 0);
+ i++;
+ }
+
+ /* figure next state to send */
+ state = TAILQ_NEXT(state, entry_list);
+
+ /* wrap to start of list if we hit the end */
+ if (!state)
+ state = TAILQ_FIRST(&state_list);
+ } while (i < sc->sc_maxcount * PFSYNC_BULKPACKETS &&
+ state != sc->sc_bulk_terminator);
+
+ if (!state || state == sc->sc_bulk_terminator) {
+ /* we're done */
+ pfsync_send_bus(sc, PFSYNC_BUS_END);
+ sc->sc_ureq_received = 0;
+ sc->sc_bulk_send_next = NULL;
+ sc->sc_bulk_terminator = NULL;
+ callout_stop(&sc->sc_bulk_tmo);
+ if (pf_status.debug >= PF_DEBUG_MISC)
+ printf("pfsync: bulk update complete\n");
+ } else {
+ /* look again for more in a bit */
+ callout_schedule(&sc->sc_bulk_tmo, 1);
+ sc->sc_bulk_send_next = state;
+ }
+ if (sc->sc_mbuf != NULL)
+ pfsync_sendout(sc);
+ splx(s);
+}
+
+void
+pfsync_bulkfail(void *v)
+{
+ struct pfsync_softc *sc = v;
+ int s, error;
+
+ if (sc->sc_bulk_tries++ < PFSYNC_MAX_BULKTRIES) {
+ /* Try again in a bit */
+ callout_schedule(&sc->sc_bulkfail_tmo, 5 * hz);
+ s = splnet();
+ error = pfsync_request_update(NULL, NULL);
+ if (error == ENOMEM) {
+ if (pf_status.debug >= PF_DEBUG_MISC)
+ printf("pfsync: cannot allocate mbufs for "
+ "bulk update\n");
+ } else
+ pfsync_sendout(sc);
+ splx(s);
+ } else {
+ /* Pretend like the transfer was ok */
+ sc->sc_ureq_sent = 0;
+ sc->sc_bulk_tries = 0;
+#if NCARP > 0
+ if (!pfsync_sync_ok)
+ carp_suppress_preempt--;
+#endif
+ pfsync_sync_ok = 1;
+ if (pf_status.debug >= PF_DEBUG_MISC)
+ printf("pfsync: failed to receive "
+ "bulk update status\n");
+ callout_stop(&sc->sc_bulkfail_tmo);
+ }
+}
+
+/* This must be called in splnet() */
+int
+pfsync_sendout(struct pfsync_softc *sc)
+{
+#if NBPFILTER > 0
+ struct ifnet *ifp = &sc->sc_if;
+#endif
+ struct mbuf *m;
+
+ callout_stop(&sc->sc_tmo);
+
+ if (sc->sc_mbuf == NULL)
+ return (0);
+ m = sc->sc_mbuf;
+ sc->sc_mbuf = NULL;
+ sc->sc_statep.s = NULL;
+
+#if NBPFILTER > 0
+ if (ifp->if_bpf)
+ bpf_mtap(ifp->if_bpf, m);
+#endif
+
+ if (sc->sc_mbuf_net) {
+ m_freem(m);
+ m = sc->sc_mbuf_net;
+ sc->sc_mbuf_net = NULL;
+ sc->sc_statep_net.s = NULL;
+ }
+
+ return pfsync_sendout_mbuf(sc, m);
+}
+
+int
+pfsync_tdb_sendout(struct pfsync_softc *sc)
+{
+#if NBPFILTER > 0
+ struct ifnet *ifp = &sc->sc_if;
+#endif
+ struct mbuf *m;
+
+ callout_stop(&sc->sc_tdb_tmo);
+
+ if (sc->sc_mbuf_tdb == NULL)
+ return (0);
+ m = sc->sc_mbuf_tdb;
+ sc->sc_mbuf_tdb = NULL;
+ sc->sc_statep_tdb.t = NULL;
+
+#if NBPFILTER > 0
+ if (ifp->if_bpf)
+ bpf_mtap(ifp->if_bpf, m);
+#endif
+
+ return pfsync_sendout_mbuf(sc, m);
+}
+
+int
+pfsync_sendout_mbuf(struct pfsync_softc *sc, struct mbuf *m)
+{
+ struct sockaddr sa;
+ struct ip *ip;
+
+ if (sc->sc_sync_ifp ||
+ sc->sc_sync_peer.s_addr != INADDR_PFSYNC_GROUP) {
+ M_PREPEND(m, sizeof(struct ip), M_DONTWAIT);
+ if (m == NULL) {
+ PFSYNC_STATINC(PFSYNC_STAT_ONOMEM);
+ return (0);
+ }
+ ip = mtod(m, struct ip *);
+ ip->ip_v = IPVERSION;
+ ip->ip_hl = sizeof(*ip) >> 2;
+ ip->ip_tos = IPTOS_LOWDELAY;
+ ip->ip_len = htons(m->m_pkthdr.len);
+ ip->ip_id = htons(ip_randomid(0));
+ ip->ip_off = htons(IP_DF);
+ ip->ip_ttl = PFSYNC_DFLTTL;
+ ip->ip_p = IPPROTO_PFSYNC;
+ ip->ip_sum = 0;
+
+ bzero(&sa, sizeof(sa));
again
+ ip->ip_src.s_addr = INADDR_ANY;
+
+ if (sc->sc_sendaddr.s_addr == INADDR_PFSYNC_GROUP)
+ m->m_flags |= M_MCAST;
+ ip->ip_dst = sc->sc_sendaddr;
+ sc->sc_sendaddr.s_addr = sc->sc_sync_peer.s_addr;
+
+ PFSYNC_STATINC(PFSYNC_STAT_OPACKETS);
+
+ if (ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_imo, NULL)) {
+ PFSYNC_STATINC(PFSYNC_STAT_OERRORS);
+ }
+ } else
+ m_freem(m);
+
+ return (0);
+}
+
+#ifdef IPSEC
+/* Update an in-kernel tdb. Silently fail if no tdb is found. */
+void
+pfsync_update_net_tdb(struct pfsync_tdb *pt)
+{
+ struct tdb *tdb;
+ int s;
+
+ /* check for invalid values */
+ if (ntohl(pt->spi) <= SPI_RESERVED_MAX ||
+ (pt->dst.sa.sa_family != AF_INET &&
+ pt->dst.sa.sa_family != AF_INET6))
+ goto bad;
+
+ s = spltdb();
+ tdb = gettdb(pt->spi, &pt->dst, pt->sproto);
+ if (tdb) {
+ pt->rpl = ntohl(pt->rpl);
+ pt->cur_bytes = betoh64(pt->cur_bytes);
+
+ /* Neither replay nor byte counter should ever decrease. */
+ if (pt->rpl < tdb->tdb_rpl ||
+ pt->cur_bytes < tdb->tdb_cur_bytes) {
+ splx(s);
+ goto bad;
+ }
+
+ tdb->tdb_rpl = pt->rpl;
+ tdb->tdb_cur_bytes = pt->cur_bytes;
+ }
+ splx(s);
+ return;
+
+ bad:
+ if (pf_status.debug >= PF_DEBUG_MISC)
+ printf("pfsync_insert: PFSYNC_ACT_TDB_UPD: "
+ "invalid value\n");
+ PFSYNC_STATINC(PFSYNC_STAT_BADSTATE);
+ return;
+}
+
+/* One of our local tdbs has been updated, need to sync rpl with others */
+int
+pfsync_update_tdb(struct tdb *tdb, int output)
+{
+ struct ifnet *ifp = NULL;
+ struct pfsync_softc *sc = pfsyncif;
+ struct pfsync_header *h;
+ struct pfsync_tdb *pt = NULL;
+ int s, i, ret = 0;
+
+ if (sc == NULL)
+ return (0);
+
+ ifp = &sc->sc_if;
+ if (ifp->if_bpf == NULL && sc->sc_sync_ifp == NULL &&
+ sc->sc_sync_peer.s_addr == INADDR_PFSYNC_GROUP) {
+ /* Don't leave any stale pfsync packets hanging around. */
+ if (sc->sc_mbuf_tdb != NULL) {
+ m_freem(sc->sc_mbuf_tdb);
+ sc->sc_mbuf_tdb = NULL;
+ sc->sc_statep_tdb.t = NULL;
+ }
+ return (0);
+ }
+
+ s = splnet();
+ if (sc->sc_mbuf_tdb == NULL) {
+ if ((sc->sc_mbuf_tdb = pfsync_get_mbuf(sc, PFSYNC_ACT_TDB_UPD,
+ (void *)&sc->sc_statep_tdb.t)) == NULL) {
+ splx(s);
+ return (ENOMEM);
+ }
+ h = mtod(sc->sc_mbuf_tdb, struct pfsync_header *);
+ } else {
+ h = mtod(sc->sc_mbuf_tdb, struct pfsync_header *);
+ if (h->action != PFSYNC_ACT_TDB_UPD) {
+ /*
+ * XXX will never happen as long as there's
+ * only one "TDB action".
+ */
+ pfsync_tdb_sendout(sc);
+ sc->sc_mbuf_tdb = pfsync_get_mbuf(sc,
+ PFSYNC_ACT_TDB_UPD, (void *)&sc->sc_statep_tdb.t);
+ if (sc->sc_mbuf_tdb == NULL) {
+ splx(s);
+ return (ENOMEM);
+ }
+ h = mtod(sc->sc_mbuf_tdb, struct pfsync_header *);
+ } else if (sc->sc_maxupdates) {
+ /*
+ * If it's an update, look in the packet to see if
+ * we already have an update for the state.
+ */
+ struct pfsync_tdb *u =
+ (void *)((char *)h + PFSYNC_HDRLEN);
+
+ for (i = 0; !pt && i < h->count; i++) {
+ if (tdb->tdb_spi == u->spi &&
+ tdb->tdb_sproto == u->sproto &&
+ !bcmp(&tdb->tdb_dst, &u->dst,
+ SA_LEN(&u->dst.sa))) {
+ pt = u;
+ pt->updates++;
+ }
+ u++;
+ }
+ }
+ }
+
+ if (pt == NULL) {
+ /* not a "duplicate" update */
+ pt = sc->sc_statep_tdb.t++;
+ sc->sc_mbuf_tdb->m_pkthdr.len =
+ sc->sc_mbuf_tdb->m_len += sizeof(struct pfsync_tdb);
+ h->count++;
+ bzero(pt, sizeof(*pt));
+
+ pt->spi = tdb->tdb_spi;
+ memcpy(&pt->dst, &tdb->tdb_dst, sizeof pt->dst);
+ pt->sproto = tdb->tdb_sproto;
+ }
+
+ /*
+ * When a failover happens, the master's rpl is probably above
+ * what we see here (we may be up to a second late), so
+ * increase it a bit for outbound tdbs to manage most such
+ * situations.
+ *
+ * For now, just add an offset that is likely to be larger
+ * than the number of packets we can see in one second. The RFC
+ * just says the next packet must have a higher seq value.
+ *
+ * XXX What is a good algorithm for this? We could use
+ * a rate-determined increase, but to know it, we would have
+ * to extend struct tdb.
+ * XXX pt->rpl can wrap over MAXINT, but if so the real tdb
+ * will soon be replaced anyway. For now, just don't handle
+ * this edge case.
+ */
+#define RPL_INCR 16384
+ pt->rpl = htonl(tdb->tdb_rpl + (output ? RPL_INCR : 0));
+ pt->cur_bytes = htobe64(tdb->tdb_cur_bytes);
+
+ if (h->count == sc->sc_maxcount ||
+ (sc->sc_maxupdates && (pt->updates >= sc->sc_maxupdates)))
+ ret = pfsync_tdb_sendout(sc);
+
+ splx(s);
+ return (ret);
+}
+#endif
+
+static int
+sysctl_net_inet_pfsync_stats(SYSCTLFN_ARGS)
+{
+
+ return (NETSTAT_SYSCTL(pfsyncstat_percpu, PFSYNC_NSTATS));
+}
+
+SYSCTL_SETUP(sysctl_net_inet_pfsync_setup, "sysctl net.inet.pfsync subtree setup")
+{
+
+ sysctl_createv(clog, 0, NULL, NULL,
+ CTLFLAG_PERMANENT,
+ CTLTYPE_NODE, "net", NULL,
+ NULL, 0, NULL, 0,
+ CTL_NET, CTL_EOL);
+ sysctl_createv(clog, 0, NULL, NULL,
+ CTLFLAG_PERMANENT,
+ CTLTYPE_NODE, "inet", NULL,
+ NULL, 0, NULL, 0,
+ CTL_NET, PF_INET, CTL_EOL);
+ sysctl_createv(clog, 0, NULL, NULL,
+ CTLFLAG_PERMANENT,
+ CTLTYPE_NODE, "pfsync",
+ SYSCTL_DESCR("pfsync related settings"),
+ NULL, 0, NULL, 0,
+ CTL_NET, PF_INET, IPPROTO_PFSYNC, CTL_EOL);
+ sysctl_createv(clog, 0, NULL, NULL,
+ CTLFLAG_PERMANENT|CTLFLAG_READONLY,
+ CTLTYPE_STRUCT, "stats",
+ SYSCTL_DESCR("pfsync statistics"),
+ sysctl_net_inet_pfsync_stats, 0, NULL, 0,
+ CTL_NET, PF_INET, IPPROTO_PFSYNC,
+ CTL_CREATE, CTL_EOL);
+}
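
A userland sketch of the replay-counter rule used by pfsync_update_tdb() and pfsync_update_net_tdb() above: the sender pads the outbound counter by RPL_INCR, and the receiver refuses any update that would move either counter backwards. `struct toy_tdb`, `wire_rpl()` and `apply_update()` are hypothetical names made up for this illustration and are not part of the patch. The byte counter is kept in host order here for brevity, whereas the patch converts it with htobe64/betoh64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

#define RPL_INCR 16384	/* same offset the patch uses */

/* Hypothetical stand-in for the two struct tdb fields the patch touches. */
struct toy_tdb {
	uint32_t rpl;		/* replay counter, host order */
	uint64_t cur_bytes;	/* byte counter, host order */
};

/* Sender side: replay value as it goes on the wire (network order). */
static uint32_t
wire_rpl(const struct toy_tdb *tdb, int output)
{
	assert(tdb != NULL);
	return htonl(tdb->rpl + (output ? RPL_INCR : 0));
}

/*
 * Receiver side: apply the update only if neither counter decreases.
 * Returns 1 if applied, 0 if rejected.
 */
static int
apply_update(struct toy_tdb *tdb, uint32_t net_rpl, uint64_t cur_bytes)
{
	uint32_t rpl = ntohl(net_rpl);

	assert(tdb != NULL);
	if (rpl < tdb->rpl || cur_bytes < tdb->cur_bytes)
		return 0;
	tdb->rpl = rpl;
	tdb->cur_bytes = cur_bytes;
	return 1;
}
```

With a local rpl of 100, wire_rpl(&t, 1) carries 16484 on the wire. A peer whose own counter is already above the received value rejects the update, which corresponds to the PFSYNC_STAT_BADSTATE path in the patch.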
diff --git a/sys/dist/pf/net/if_pfsync.h b/sys/dist/pf/net/if_pfsync.h
new file mode 100644
index 0000000..d42a2f0
--- /dev/null
+++ b/sys/dist/pf/net/if_pfsync.h
@@ -0,0 +1,284 @@
+/* $NetBSD: if_pfsync.h Exp $ */
+/* $OpenBSD: if_pfsync.h,v 1.31 2007/05/31 04:11:42 mcbride Exp $ */
+
+/*
+ * Copyright (c) 2001 Michael Shalayeff
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR OR HIS RELATIVES BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF MIND, USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _NET_IF_PFSYNC_H_
+#define _NET_IF_PFSYNC_H_
+
+#define INADDR_PFSYNC_GROUP __IPADDR(0xe00000f0) /* 224.0.0.240 */
+
+#define PFSYNC_ID_LEN sizeof(u_int64_t)
+
+struct pfsync_tdb {
+ u_int32_t spi;
+ union sockaddr_union dst;
+ u_int32_t rpl;
+ u_int64_t cur_bytes;
+ u_int8_t sproto;
+ u_int8_t updates;
+ u_int8_t pad[2];
+} __packed;
+
+struct pfsync_state_upd {
+ u_int32_t id[2];
+ struct pfsync_state_peer src;
+ struct pfsync_state_peer dst;
+ u_int32_t creatorid;
+ u_int32_t expire;
+ u_int8_t timeout;
+ u_int8_t updates;
+ u_int8_t pad[6];
+} __packed;
+
+struct pfsync_state_del {
+ u_int32_t id[2];
+ u_int32_t creatorid;
+ struct {
+ u_int8_t state;
+ } src;
+ struct {
+ u_int8_t state;
+ } dst;
+ u_int8_t pad[2];
+} __packed;
+
+struct pfsync_state_upd_req {
+ u_int32_t id[2];
+ u_int32_t creatorid;
+ u_int32_t pad;
+} __packed;
+
+struct pfsync_state_clr {
+ char ifname[IFNAMSIZ];
+ u_int32_t creatorid;
+ u_int32_t pad;
+} __packed;
+
+struct pfsync_state_bus {
+ u_int32_t creatorid;
+ u_int32_t endtime;
+ u_int8_t status;
+#define PFSYNC_BUS_START 1
+#define PFSYNC_BUS_END 2
+ u_int8_t pad[7];
+} __packed;
+
+#ifdef _KERNEL
+
+union sc_statep {
+ struct pfsync_state *s;
+ struct pfsync_state_upd *u;
+ struct pfsync_state_del *d;
+ struct pfsync_state_clr *c;
+ struct pfsync_state_bus *b;
+ struct pfsync_state_upd_req *r;
+};
+
+union sc_tdb_statep {
+ struct pfsync_tdb *t;
+};
+
+extern int pfsync_sync_ok;
+
+struct pfsync_softc {
+ struct ifnet sc_if;
+ struct ifnet *sc_sync_ifp;
+
+ struct ip_moptions sc_imo;
+ struct callout sc_tmo;
+ struct callout sc_tdb_tmo;
+ struct callout sc_bulk_tmo;
+ struct callout sc_bulkfail_tmo;
+ struct in_addr sc_sync_peer;
+ struct in_addr sc_sendaddr;
+ struct mbuf *sc_mbuf; /* current cumulative mbuf */
+ struct mbuf *sc_mbuf_net; /* current cumulative mbuf */
+ struct mbuf *sc_mbuf_tdb; /* ditto for TDB updates */
+ union sc_statep sc_statep;
+ union sc_statep sc_statep_net;
+ union sc_tdb_statep sc_statep_tdb;
+ u_int32_t sc_ureq_received;
+ u_int32_t sc_ureq_sent;
+ struct pf_state *sc_bulk_send_next;
+ struct pf_state *sc_bulk_terminator;
+ int sc_bulk_tries;
+ int sc_maxcount; /* number of states in mtu */
+ int sc_maxupdates; /* number of updates/state */
+};
+
+extern struct pfsync_softc *pfsyncif;
+#endif
+
+
+struct pfsync_header {
+ u_int8_t version;
+#define PFSYNC_VERSION 3
+ u_int8_t af;
+ u_int8_t action;
+#define PFSYNC_ACT_CLR 0 /* clear all states */
+#define PFSYNC_ACT_INS 1 /* insert state */
+#define PFSYNC_ACT_UPD 2 /* update state */
+#define PFSYNC_ACT_DEL 3 /* delete state */
+#define PFSYNC_ACT_UPD_C 4 /* "compressed" state update */
+#define PFSYNC_ACT_DEL_C 5 /* "compressed" state delete */
+#define PFSYNC_ACT_INS_F 6 /* insert fragment */
+#define PFSYNC_ACT_DEL_F 7 /* delete fragments */
+#define PFSYNC_ACT_UREQ 8 /* request "uncompressed" state */
+#define PFSYNC_ACT_BUS 9 /* Bulk Update Status */
+#define PFSYNC_ACT_TDB_UPD 10 /* TDB replay counter update */
+#define PFSYNC_ACT_MAX 11
+ u_int8_t count;
+ u_int8_t pf_chksum[PF_MD5_DIGEST_LENGTH];
+} __packed;
+
+#define PFSYNC_BULKPACKETS 1 /* # of packets per timeout */
+#define PFSYNC_MAX_BULKTRIES 12
+#define PFSYNC_HDRLEN sizeof(struct pfsync_header)
+#define PFSYNC_ACTIONS \
+ "CLR ST", "INS ST", "UPD ST", "DEL ST", \
+ "UPD ST COMP", "DEL ST COMP", "INS FR", "DEL FR", \
+ "UPD REQ", "BLK UPD STAT", "TDB UPD"
+
+#define PFSYNC_DFLTTL 255
+
+#define PFSYNC_STAT_IPACKETS 0 /* total input packets, IPv4 */
+#define PFSYNC_STAT_IPACKETS6 1 /* total input packets, IPv6 */
+#define PFSYNC_STAT_BADIF 2 /* not the right interface */
+#define PFSYNC_STAT_BADTTL 3 /* TTL is not PFSYNC_DFLTTL */
+#define PFSYNC_STAT_HDROPS 4 /* packets shorter than hdr */
+#define PFSYNC_STAT_BADVER 5 /* bad (incl unsupp) version */
+#define PFSYNC_STAT_BADACT 6 /* bad action */
+#define PFSYNC_STAT_BADLEN 7 /* data length does not match */
+#define PFSYNC_STAT_BADAUTH 8 /* bad authentication */
+#define PFSYNC_STAT_STALE 9 /* stale state */
+#define PFSYNC_STAT_BADVAL 10 /* bad values */
+#define PFSYNC_STAT_BADSTATE 11 /* insert/lookup failed */
+#define PFSYNC_STAT_OPACKETS 12 /* total output packets, IPv4 */
+#define PFSYNC_STAT_OPACKETS6 13 /* total output packets, IPv6 */
+#define PFSYNC_STAT_ONOMEM 14 /* no memory for an mbuf */
+#define PFSYNC_STAT_OERRORS 15 /* ip output error */
+
+#define PFSYNC_NSTATS 16
+
+/*
+ * Configuration structure for SIOCSETPFSYNC and SIOCGETPFSYNC
+ */
+struct pfsyncreq {
+ char pfsyncr_syncdev[IFNAMSIZ];
+ struct in_addr pfsyncr_syncpeer;
+ int pfsyncr_maxupdates;
+ int pfsyncr_authlevel;
+};
+
+
+/* for copies to/from network */
+#define pf_state_peer_hton(s,d) do { \
+ (d)->seqlo = htonl((s)->seqlo); \
+ (d)->seqhi = htonl((s)->seqhi); \
+ (d)->seqdiff = htonl((s)->seqdiff); \
+ (d)->max_win = htons((s)->max_win); \
+ (d)->mss = htons((s)->mss); \
+ (d)->state = (s)->state; \
+ (d)->wscale = (s)->wscale; \
+ if ((s)->scrub) { \
+ (d)->scrub.pfss_flags = \
+ htons((s)->scrub->pfss_flags & PFSS_TIMESTAMP); \
+ (d)->scrub.pfss_ttl = (s)->scrub->pfss_ttl; \
+ (d)->scrub.pfss_ts_mod = htonl((s)->scrub->pfss_ts_mod);\
+ (d)->scrub.scrub_flag = PFSYNC_SCRUB_FLAG_VALID; \
+ } \
+} while (0)
+
+#define pf_state_peer_ntoh(s,d) do { \
+ (d)->seqlo = ntohl((s)->seqlo); \
+ (d)->seqhi = ntohl((s)->seqhi); \
+ (d)->seqdiff = ntohl((s)->seqdiff); \
+ (d)->max_win = ntohs((s)->max_win); \
+ (d)->mss = ntohs((s)->mss); \
+ (d)->state = (s)->state; \
+ (d)->wscale = (s)->wscale; \
+ if ((s)->scrub.scrub_flag == PFSYNC_SCRUB_FLAG_VALID && \
+ (d)->scrub != NULL) { \
+ (d)->scrub->pfss_flags = \
+ ntohs((s)->scrub.pfss_flags) & PFSS_TIMESTAMP; \
+ (d)->scrub->pfss_ttl = (s)->scrub.pfss_ttl; \
+ (d)->scrub->pfss_ts_mod = ntohl((s)->scrub.pfss_ts_mod);\
+ } \
+} while (0)
+
+#define pf_state_host_hton(s,d) do { \
+ bcopy(&(s)->addr, &(d)->addr, sizeof((d)->addr)); \
again
+ (d)->port = (s)->port; \
+} while (0)
+
+#define pf_state_host_ntoh(s,d) do { \
+ bcopy(&(s)->addr, &(d)->addr, sizeof((d)->addr)); \
again
+ (d)->port = (s)->port; \
+} while (0)
+
+#define pf_state_counter_hton(s,d) do { \
+ d[0] = htonl((s>>32)&0xffffffff); \
+ d[1] = htonl(s&0xffffffff); \
+} while (0)
+
+#define pf_state_counter_ntoh(s,d) do { \
+ d = ntohl(s[0]); \
+ d = d<<32; \
+ d += ntohl(s[1]); \
+} while (0)
+
+#ifdef _KERNEL
+void pfsync_input(struct mbuf *, ...);
+int pfsync_clear_states(u_int32_t, char *);
+int pfsync_pack_state(u_int8_t, struct pf_state *, int);
+#define pfsync_insert_state(st) do { \
+ if ((st->rule.ptr->rule_flag & PFRULE_NOSYNC) || \
+ (st->state_key->proto == IPPROTO_PFSYNC)) \
+ st->sync_flags |= PFSTATE_NOSYNC; \
+ else if (!st->sync_flags) \
+ pfsync_pack_state(PFSYNC_ACT_INS, (st), \
+ PFSYNC_FLAG_COMPRESS); \
+ st->sync_flags &= ~PFSTATE_FROMSYNC; \
+} while (0)
+#define pfsync_update_state(st) do { \
+ if (!st->sync_flags) \
+ pfsync_pack_state(PFSYNC_ACT_UPD, (st), \
+ PFSYNC_FLAG_COMPRESS); \
+ st->sync_flags &= ~PFSTATE_FROMSYNC; \
+} while (0)
+#define pfsync_delete_state(st) do { \
+ if (!st->sync_flags) \
+ pfsync_pack_state(PFSYNC_ACT_DEL, (st), \
+ PFSYNC_FLAG_COMPRESS); \
+} while (0)
+#ifdef NOTYET
+int pfsync_update_tdb(struct tdb *, int);
+#endif /* NOTYET */
+#endif
+
+#endif /* _NET_IF_PFSYNC_H_ */
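
The pf_state_counter_hton/pf_state_counter_ntoh macros above carry a 64-bit counter as two 32-bit words in network byte order, high word first. A minimal function-style sketch of the same conversion follows. `counter_hton()` and `counter_ntoh()` are hypothetical names used only for this illustration. Unlike the macros, they evaluate their arguments exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Split a host-order 64-bit counter into two network-order 32-bit words. */
static void
counter_hton(uint64_t s, uint32_t d[2])
{
	assert(d != NULL);
	d[0] = htonl((uint32_t)(s >> 32));		/* high 32 bits first */
	d[1] = htonl((uint32_t)(s & 0xffffffffU));	/* then low 32 bits */
}

/* Rebuild the host-order 64-bit counter from the two wire words. */
static uint64_t
counter_ntoh(const uint32_t s[2])
{
	uint64_t d;

	assert(s != NULL);
	d = ntohl(s[0]);
	d <<= 32;
	d += ntohl(s[1]);
	return d;
}
```

A round trip through counter_hton() and counter_ntoh() returns the original value on both little-endian and big-endian hosts, because each 32-bit half is converted separately.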
diff --git a/sys/dist/pf/net/pf.c b/sys/dist/pf/net/pf.c
index b95c85c..625287a 100644
Looks good to me otherwise! Thanks for all the work.

christos

