Link aggregation between NetBSD agr and Linux bond interfaces
BERTRAND Joël
2016-12-07 22:27:29 UTC
Hello,

I'm trying to aggregate two OpenVPN links between a Linux server
(Debian) and a NetBSD client (running 7.0.2). I have also run some tests
with a 7.99.43 kernel/userland on an AlphaStation, with the same result.

I believe I understand agr's capabilities: I have used it with a
switch for a long time without trouble.

On the Linux side, I have started two OpenVPN servers (UDP
configuration). Each VPN runs over a different VDSL2 link. I don't think
the OpenVPN configuration is at fault, as it runs fine without
aggregation. I have stopped all firewalls on both machines.

Linux
|
+- eth0 (LAN)
+- eth1 (WAN ISP1)
+- eth2 (WAN ISP2)
+- tap1 (UDP on ISP1)
+- tap2 (UDP on ISP2)

I have added in /etc/network/interfaces:

auto bond0
iface bond0 inet static
address 192.168.1.1
netmask 255.255.255.0
slaves tap1 tap2
bond_mode 4
# 4 = 802.3ad, I have tried round-robin (0)
bond_miimon 100
bond_downdelay 200
bond_updelay 200

and I obtain a bond0 interface.

On the NetBSD side, I have tried to configure a new agr interface,
without success.

NetBSD
|
+- wm0 (WAN)
+- tap0 (to Linux's tap1)
+- tap1 (to Linux's tap2)
+- agr0 (wm1 + wm2, 802.3ad)

wm1 and wm2 are linked into agr0:

legendre# cat ifconfig.agr0
create
agrport wm1
agrport wm2
inet 192.168.10.128 netmask 255.255.255.0
up
!ifconfig wm1 up
!ifconfig wm2 up
legendre#

I have tried to create a new agr1 interface, but it doesn't work as
expected. Maybe I have misunderstood something.

ifconfig agr1 create creates the new interface, and I then have to add and
remove agrports. So I have written OpenVPN up and down scripts to add and
remove the agrports. Problem: when a tunnel stops, its tap interface is
torn down and ifconfig returns:

agr1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
...
agrport: , flags=0x3<COLLECTING,DISTRIBUTING>
agrport: tap1, flags=0x3<COLLECTING,DISTRIBUTING>
...

Please note that tap0 is replaced by '' and this port can no longer be
deleted. After I added the persist-tun option to the OpenVPN
configuration, the problem disappeared.
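
For anyone reproducing this, the up/down hook can be sketched roughly as below. It is only an illustration, not the exact scripts used here: the function name agr_port and the AGR_IF and DRY_RUN variables are made up for the example, and only the ifconfig agrport / -agrport commands are real.

```sh
#!/bin/sh
# Sketch of an OpenVPN up/down helper for NetBSD agr(4).
# AGR_IF and DRY_RUN are illustrative names, not OpenVPN or NetBSD settings.
AGR_IF="${AGR_IF:-agr1}"

# agr_port ACTION TAPDEV
#   ACTION is "up" (attach TAPDEV as a port) or "down" (detach it).
agr_port() {
    case "$1" in
        up)   set -- ifconfig "$AGR_IF" agrport "$2" ;;
        down) set -- ifconfig "$AGR_IF" -agrport "$2" ;;
        *)    echo "usage: agr_port up|down tapN" >&2; return 2 ;;
    esac
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"    # dry run: only print the command
    else
        "$@"         # really run ifconfig (needs root)
    fi
}
```

With DRY_RUN=1 set, `agr_port up tap0` only prints `ifconfig agr1 agrport tap0`. OpenVPN normally hands the tap device name to its up and down scripts as the first argument, so a wrapper could call `agr_port up "$1"`; check openvpn(8) for your version before relying on that.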

I have created the agr interface by hand (OpenVPN without inet/inet6
parameters, with the network configuration applied directly to the agr1
interface). I still haven't obtained a working aggregated tunnel. I have
tried the link0 and -link0 flags, without success.

Even with the interfaces up on both sides, the tunnel is unusable.

Is it possible to write a configuration that aggregates two OpenVPN
links between a Linux server and a NetBSD client?

Best regards,

JKB

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Gert Doering
2016-12-08 08:22:23 UTC
Hi,
Post by BERTRAND Joël
Even with interfaces up on both sides, tunnel is unusable.
What does "unusable" mean, exactly? Is the agr interface down, are
packets not being sent to tap0/tap1, are they not passed to openvpn,
are they not being received, etc.?

You'll need to do some tcpdumping on all interfaces involved to see
how far the packets get...

From what I hear from people in the Linux world, this *should* work (aka
"people have done it successfully on Linux"), though I've never done
it myself. But it should work...

Looking at the man page, agr seems to default to using LACP, which might
or might not be the problem - so I'd start by turning it off ("link1")
to see if a static tunnel works. If that works, check that the Linux
side is also speaking LACP and whether that part comes up.

Maybe OpenVPN is getting confused by the LACP frames and not forwarding
them (it *should* be totally transparent in TAP mode, but I'm not sure
anyone has tested LACP-over-TAP yet).

gert
--
USENET is *not* the non-clickable part of WWW!
//www.muc.de/~gert/
Gert Doering - Munich, Germany ***@greenie.muc.de
fax: +49-89-35655025 ***@net.informatik.tu-muenchen.de

BERTRAND Joël
2016-12-08 09:41:10 UTC
Post by Gert Doering
Hi,
Post by BERTRAND Joël
Even with interfaces up on both sides, tunnel is unusable.
What does "unusable" mean, exactly? Is the agr interface down, are
packets not being sent to tap0/tap1, are they not passed to openvpn,
are they not being received, etc.?
No connection. Both OpenVPN links run as expected. But even though agr0 is
up (and configured with an IP address, of course), no data is received
from the aggregated interface. I suppose that NetBSD tries to send data,
but the Linux host doesn't receive any packets.
Post by Gert Doering
You'll need to do some tcpdumping on all interfaces involved to see
how far the packets get...
Of course, I have tried to check the link with tcpdump. When I try to ping
the Linux host from NetBSD, I obtain on agr0:
10:34:46.034686 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:47.034689 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:48.034682 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:49.034679 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:50.034666 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:51.034655 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:52.034655 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:53.034647 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:54.034642 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:55.034639 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:56.034624 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:57.034635 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:58.034615 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:34:59.034613 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28
10:35:00.034612 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length 28

If I try to ping NetBSD from Linux, I obtain the same message.
Post by Gert Doering
What I hear from people in the Linux world, this *should* work (aka
"people have done it successfully on Linux"), though I've never done
it myself. But it should work...
Looking at the man page, agr seems to default to use LACP, which might
or might not be the problem - so I'd start by turning it off ("link1")
to see if a static tunnel works. If that works, check that the linux
side is also speaking LACP and whether that part comes up.
I have turned off link1. On the Linux side, I believe bond0 is
configured to use LACP:
rayleigh:[~] > cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
...
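
To check the mode from a script rather than by eye, something like the following sketch could be used. The function names bond_mode and bond_uses_lacp are made up for the example; the only thing taken from the output above is the "Bonding Mode:" line of /proc/net/bonding/bond0.

```sh
# bond_mode [FILE]: print the value of the "Bonding Mode:" line.
# FILE defaults to the bond0 status file shown above.
bond_mode() {
    sed -n 's/^Bonding Mode: *//p' "${1:-/proc/net/bonding/bond0}"
}

# bond_uses_lacp [FILE]: succeed if the mode line mentions 802.3ad.
bond_uses_lacp() {
    bond_mode "$@" | grep -q '802\.3ad'
}
```

For example, `bond_uses_lacp && echo "bond0 speaks LACP"` on the Linux host. This only reports the configured mode; it says nothing about whether LACP negotiation with the peer actually completed.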
Post by Gert Doering
Maybe OpenVPN is getting confused by the LACP frames and not forwarding
them (it *should* be totally transparent in TAP mode, but I'm not sure
anyone has tested LACP-over-TAP yet)
Best regards,

JKB


Paul Goyette
2016-12-08 09:48:26 UTC
Post by BERTRAND Joël
Post by Gert Doering
Hi,
Post by BERTRAND Joël
Even with interfaces up on both sides, tunnel is unusable.
What does "unusable" mean, exactly? Is the agr interface down, are
packets not being sent to tap0/tap1, are they not passed to openvpn,
are they not being received, etc.?
No connection. Both OpenVPN links run as expected. But even if agr0
is up (and configured of course with an IP address), no data is received from
aggregated interface. I suppose that NetBSD try to send data, but Linux host
doesn't receive any packets.
Post by Gert Doering
You'll need to do some tcpdumping on all interfaces involved to see
how far the packets get...
Of course, I have tried to check link with tcpdump. WHen I try to
10:34:46.034686 ARP, Request who-has 192.168.100.1 tell 192.168.100.2, length
28
Great. You should also run tcpdump on both member links to determine
whether or not the agr driver is properly putting packets into those
queues.

And if that succeeds, you should then use tcpdump to watch for packets
being sent on the "real" interfaces...

(I think this is what the previous responder meant when he said "run
tcpdump on all of the involved interfaces...")
Post by BERTRAND Joël
If I try to ping NetBSD from Linux, I obtain the same message.
Hopefully, the addresses are reversed! If NetBSD --> Linux says

Request who-has 192.168.100.1 tell 192.168.100.2

then Linux --> NetBSD should say

Request who-has 192.168.100.2 tell 192.168.100.1


:)
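
To avoid misreading which side is asking, the direction can be pulled out of tcpdump's text output mechanically. This is only a sketch: arp_direction is a made-up helper name, and it assumes the one-line "Request who-has X tell Y," format shown in the captures above.

```sh
# arp_direction: read tcpdump text lines on stdin and print
# "<asker> -> <target>" for every ARP request found.
arp_direction() {
    sed -n 's/.*Request who-has \([0-9.]*\) tell \([0-9.]*\),.*/\2 -> \1/p'
}
```

It could be fed live, for instance with `tcpdump -l -n -i agr0 arp | arp_direction`. A request sent by NetBSD (192.168.100.2) for the Linux address then shows up as `192.168.100.2 -> 192.168.100.1`, and the reverse for a request coming from Linux.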




+------------------+--------------------------+------------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
BERTRAND Joël
2016-12-08 10:04:50 UTC
Post by Paul Goyette
Post by BERTRAND Joël
Post by Gert Doering
Hi,
Post by BERTRAND Joël
Even with interfaces up on both sides, tunnel is unusable.
What does "unusable" mean, exactly? Is the agr interface down, are
packets not being sent to tap0/tap1, are they not passed to openvpn,
are they not being received, etc.?
No connection. Both OpenVPN links run as expected. But even if
agr0 is up (and configured of course with an IP address), no data is
received from aggregated interface. I suppose that NetBSD try to send
data, but Linux host doesn't receive any packets.
Post by Gert Doering
You'll need to do some tcpdumping on all interfaces involved to see
how far the packets get...
Of course, I have tried to check link with tcpdump. WHen I try to
10:34:46.034686 ARP, Request who-has 192.168.100.1 tell 192.168.100.2,
length 28
Great. You should also run tcpdump on both member links to determine
whether or not the agr driver is properly putting packets into those
queues.
I ran tcpdump on both members. When NetBSD sends an ARP request, Linux
doesn't receive it. And when Linux sends a request, NetBSD doesn't
receive it.
Post by Paul Goyette
And if that succeeds, you should then use tcpdump to watch for packets
being sent on the "real" interfaces...
(I think this is what the previous responder meant when he said "run
tcpdump on all of the involved interfaces...)
Post by BERTRAND Joël
If I try to ping NetBSD from Linux, I obtain the same message.
Hopefully, the addresses are reversed! If NetBSD --> Linux says
Request who-has 192.168.100.1 tell 192.168.100.2
then Linux --> NetBSD should say
Request who-has 192.168.100.2 tell 192.168.100.1
Of course.

I have also run tcpdump directly on the tap interfaces:

Root rayleigh:[~] > tcpdump -i tap3 -p (linux)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3, link-type EN10MB (Ethernet), capture size 262144 bytes
11:00:31.834796 LACPv1, length 110
11:00:53.322376 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:54.318759 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:55.318748 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:56.321787 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:57.318751 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:58.318767 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:01:01.850763 LACPv1, length 110
11:01:31.854744 LACPv1, length 110

einstein# tcpdump -i tap0 -p (netbsd)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:00:53.324275 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:54.320839 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:55.320676 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:56.323695 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:57.320671 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:00:58.320727 ARP, Request who-has 192.168.100.2 tell 192.168.100.1,
length 28
11:01:01.852714 LACPv1, length 110
11:01:31.856649 LACPv1, length 110

Root rayleigh:[~] > tcpdump -i tap4 -p
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap4, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:01.874744 LACPv1, length 110
...
einstein# tcpdump -i tap1 -p
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:01.876694 LACPv1, length 110
...

Thus tap3 (Linux) and tap0 (NetBSD) are connected. The same holds for
tap4 (Linux) and tap1 (NetBSD).

JKB


Gert Doering
2016-12-08 10:16:19 UTC
Hi,
Post by BERTRAND Joël
Post by Gert Doering
You'll need to do some tcpdumping on all interfaces involved to see
how far the packets get...
On *all* interfaces...
Post by BERTRAND Joël
Of course, I have tried to check link with tcpdump. WHen I try to ping
10:34:46.034686 ARP, Request who-has 192.168.100.1 tell 192.168.100.2,
length 28
10:34:47.034689 ARP, Request who-has 192.168.100.1 tell 192.168.100.2,
length 28
Now you need to do the tcpdump on tap0 and tap1 to see where the problem
happens - is "agr -> (tap0,tap1)" not working, or is something going wrong
between tap0->openvpn->tap1->bond0 on the other end?

If you see the packets on tap0 or tap1, the NetBSD "agr" part is working,
so then you need to tcpdump on tap1/tap2 on the linux side and the
bond interface there to see who is eating them.
Post by BERTRAND Joël
Post by Gert Doering
Looking at the man page, agr seems to default to use LACP, which might
or might not be the problem - so I'd start by turning it off ("link1")
to see if a static tunnel works. If that works, check that the linux
side is also speaking LACP and whether that part comes up.
I have turned of link1. And on linux side, I suppose bond0 is
Turn link1 *on*. As per the man page, that turns *off* LACP.

Ditto on Linux. LACP is good (because it can notice if one of the taps
isn't working), but it is an extra complication on the step towards
"where is it not working", so initial troubleshooting should be without
LACP.

(Note: if talking to a switch, leave LACP on, of course, to avoid bridge
loops)
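
A minimal sketch of that first step on the NetBSD side, assuming the interface is agr0 and taking the man page reading above (link1 disables LACP) at face value: after `ifconfig agr0 link1`, the flag should appear in the interface's flags line. The helper name has_link1 is made up for the example.

```sh
# has_link1: read `ifconfig agrN` output on stdin and succeed if LINK1
# is listed in the first flags=...<...> line (the interface's own flags,
# not the per-port agrport flags that follow).
has_link1() {
    sed -n 's/.*flags=[0-9a-fx]*<\([^>]*\)>.*/\1/p' |
        head -n 1 | tr ',' '\n' | grep -qx 'LINK1'
}
```

Typical use would be `ifconfig agr0 link1 && ifconfig agr0 | has_link1 && echo "link1 is set"`. It only confirms the flag is set; whether frames then actually flow still has to be checked with tcpdump.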

gert
Gert Doering
2016-12-08 10:22:46 UTC
Hi,
Post by BERTRAND Joël
Post by Paul Goyette
Great. You should also run tcpdump on both member links to determine
whether or not the agr driver is properly putting packets into those
queues.
I run tcpdump of both members. When NetBSD sends an arp request, Linux
doesn't receive this request. And when linux sends request, NetBSD
doesn't receive it.
There are *10* interfaces involved here. You need to check the full
packet flow to see where it gets lost.

It starts with agr0 -> check, you see the packets.

Then you need to do tcpdump on tap0, tap1 on the NetBSD side.

If you see the packets there, run tcpdump on your outgoing "real" interface
to see if the openvpn packets go out (very likely not the problem).

If yes, then run tcpdump on tap1, tap2 on the Linux side, and see if
you can see the ARP query *from the NetBSD* side coming *in*.

And if all this is yes, then tcpdump on the bond interface.
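
The order above can be written down as a list of capture commands, one per virtual interface along the path. This is a sketch only: capture_cmds is a made-up helper, the interface names have to be supplied by whoever runs it, and the filter (ARP plus the 0x8809 "slow protocols" ethertype that LACP uses) is one reasonable choice rather than the only one.

```sh
# capture_cmds IFACE...: print one tcpdump command per interface, in the
# order given, limited to ARP and LACP frames. Nothing is executed.
capture_cmds() {
    for ifc in "$@"; do
        echo "tcpdump -n -e -p -i $ifc 'arp or ether proto 0x8809'"
    done
}
```

On the NetBSD side that could be `capture_cmds agr0 tap0 tap1`, and on the Linux side the two tap interfaces followed by bond0. On the "real" interfaces the frames are wrapped in OpenVPN's UDP packets, so this filter would match nothing there; filter on the tunnel's UDP port instead (1194 is OpenVPN's default, the ports used here are not stated in the thread).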


("both members" does not mean anything specific, as that could be 4
different tapping points for each of them)


From the tcpdumps you are showing, it seems as if the packets correctly
traverse agr0->tap0->tap3 - which is good.

You also see LACPv1 frames, which is also good (so the other idea of
disabling LACP to see if that's the culprit should not be necessary).

I infer you do not see the packets agr0 sent out as "incoming" on the bond
interface on the linux side - which hints at "something in the bond
config is not right" (and not on the NetBSD side).

gert
BERTRAND Joël
2016-12-08 10:34:34 UTC
Post by Gert Doering
Hi,
Then you need to do tcpdump on tap0, tap1 on the NetBSD side.
I have done so, while Linux tries to ping NetBSD over bond0/agr0.
192.168.100.1 is the Linux IP address and 192.168.100.2 is the NetBSD one.

einstein# tcpdump -i tap1 -p -vv (netbsd)
tcpdump: listening on tap1, link-type EN10MB (Ethernet), capture size
262144 bytes
11:24:34.116353 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.2 tell 192.168.100.1, length 28
11:24:36.115263 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.2 tell 192.168.100.1, length 28
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
einstein# tcpdump -i tap0 -p -vv
tcpdump: listening on tap0, link-type EN10MB (Ethernet), capture size
262144 bytes
11:24:41.135257 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.2 tell 192.168.100.1, length 28
11:24:43.135210 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.2 tell 192.168.100.1, length 28
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
einstein# tcpdump -i agr0 -p -vv
tcpdump: listening on agr0, link-type EN10MB (Ethernet), capture size
262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

You can see that both tap0 and tap1 on the NetBSD side receive the ARP
requests.

einstein# ifconfig agr0
agr0: flags=0xb843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,LINK1,MULTICAST>
mtu 1500
agrport: tap0, flags=0x3<COLLECTING,DISTRIBUTING>
agrport: tap1, flags=0x3<COLLECTING,DISTRIBUTING>
address: f2:0b:a4:36:c5:8b
inet 192.168.100.2/24 broadcast 192.168.100.255 flags 0x0
inet6 fe80::f00b:a4ff:fe36:c58b%agr0/64 flags 0x0 scopeid 0x6

Of course, bond0 on the Linux side is configured to use the round-robin
algorithm. On the NetBSD side, tap0 and tap1 receive the packets, but
these packets are not passed up to agr0. Thus, NetBSD cannot answer the
ARP requests.
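
The same comparison can be done mechanically on saved captures. The sketch below assumes each interface's tcpdump text output was saved beforehand as DIR/IFACE.txt (a naming convention invented for this example, as are the function names) and reports the first interface along the path that saw no ARP request.

```sh
# arp_count FILE: number of ARP request lines in a saved tcpdump text log.
# A missing or unreadable file prints nothing, which is treated as 0 below.
arp_count() {
    grep -c 'Request who-has' "$1" 2>/dev/null || true
}

# first_silent_hop DIR IFACE...: walk DIR/IFACE.txt in path order and print
# the first interface whose log holds no ARP request. Returns 1 if every
# hop saw at least one.
first_silent_hop() {
    dir=$1; shift
    for ifc in "$@"; do
        n=$(arp_count "$dir/$ifc.txt")
        if [ "${n:-0}" -eq 0 ]; then
            echo "$ifc"
            return 0
        fi
    done
    return 1
}
```

For the receive direction described above, `first_silent_hop /tmp/caps tap0 tap1 agr0` would print agr0 if the captures look like the ones shown: requests arrive on both taps but never on the aggregate.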

If I try to ping the Linux server from NetBSD, tcpdump (on the NetBSD
side) returns:
einstein# tcpdump -i agr0 -p -vv
tcpdump: listening on agr0, link-type EN10MB (Ethernet), capture size
262144 bytes
11:28:20.677465 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.1 tell 192.168.100.2, length 28
11:28:21.680338 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.1 tell 192.168.100.2, length 28
11:28:22.679356 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
192.168.100.1 tell 192.168.100.2, length 28
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
einstein# tcpdump -i tap0 -p -vv
tcpdump: listening on tap0, link-type EN10MB (Ethernet), capture size
262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
einstein# tcpdump -i tap1 -p -vv
tcpdump: listening on tap1, link-type EN10MB (Ethernet), capture size
262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
einstein#

Packets are sent on agr0, but not on tap0 and tap1! Of course, I have
verified that tap0 and tap1 are agr0 slave interfaces:

einstein# ifconfig agr0
agr0: flags=0xb843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,LINK1,MULTICAST>
mtu 1500
agrport: tap0, flags=0x3<COLLECTING,DISTRIBUTING>
agrport: tap1, flags=0x3<COLLECTING,DISTRIBUTING>
address: f2:0b:a4:36:c5:8b
inet 192.168.100.2/24 broadcast 192.168.100.255 flags 0x0
inet6 fe80::f00b:a4ff:fe36:c58b%agr0/64 flags 0x0 scopeid 0x6
Post by Gert Doering
If you see the packets there, run tcpdump on your outgoing "real" interface
to see if the openvpn packets go out (very likely not the problem).
If yes, then run tcpdump on tap1, tap2 on the Linux side, and see if
you can see the ARP query *from the NetBSD* side coming *in*.
And if all this is yes, then tcpdump on the bond interface.
("both members" is not meaning anything, as that could be 4 different
tapping points for each of them)
From the tcpdumps you are showing, it seems as if the packets correctly
traverse agr0->tap0->tap3 - which is good.
You also see LACPv1 frames, which is also good (so the other idea of
disabling LACP to see if that's the culprit should not be necessary).
I infer you do not see the packets agr0 sent out as "incoming" on the bond
interface on the linux side - which hints at "something in the bond
config is not right" (and not on the NetBSD side).
If I understand the tcpdump traces correctly, my trouble comes from the
NetBSD agr configuration.

Regards,

JKB

Gert Doering
2016-12-08 10:41:24 UTC
Hi,
Post by BERTRAND Joël
If I understand tcpdump traces, my trouble comes from NetBSD agr
configuration.
I agree. This was not totally clear (to me) from your previous mails.

But indeed, if "packet is sent into agr0" is not making it to tap0/tap1, and
"packet received by openvpn and then given to tap0/tap1" is not showing
up on agr0, it's "not Linux and not OpenVPN".

Back to idea 1: does turning off LACP (setting link1) make a difference?

gert
BERTRAND Joël
2016-12-08 10:48:03 UTC
Post by Gert Doering
Hi,
Post by BERTRAND Joël
If I understand tcpdump traces, my trouble comes from NetBSD agr
configuration.
I agree. This was not totally clear (to me) from your previous mails.
But indeed, if "packet is sent into agr0" is not making tap0/tap1, and
"packet received by openvpn and then given to tap0/tap1" is not showing
up on agr0, it's "not Linux and not OpenVPN".
Back to idea 1: does turning off LACP (setting link1) make a difference?
No :

einstein# ifconfig agr0
agr0: flags=0xb843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,LINK1,MULTICAST>
mtu 1500
agrport: tap0, flags=0x3<COLLECTING,DISTRIBUTING>
agrport: tap1, flags=0x3<COLLECTING,DISTRIBUTING>
address: f2:0b:a4:36:c5:8b
inet 192.168.100.2/24 broadcast 192.168.100.255 flags 0x0
inet6 fe80::f00b:a4ff:fe36:c58b%agr0/64 flags 0x0 scopeid 0x6

Of course, I set link1 while no slave interface was attached.

Regards,

JKB
