About bridges

Staffan Thomén

2021-05-08 12:51:14 UTC

Hey!

I was toying with bridging three of the four ethernet ports on a PC Engines
apu4d4 recently and bumped into something that made me unsure if I understood
how to use bridges correctly.

My setup is like this:

wm1 configured with an address
wm2 no address, just up
wm3 same as 2

and the bridge:

brconfig bridge0 add wm1 stp wm1 add wm2 stp wm2 add wm3 stp wm3

This all works, packets flow and addresses are learned on all interfaces, but
ONLY if there is a cable with a link on wm1. It seems to be that if there is
no link on wm1 the interface (status: no carrier and address shows <DETACHED>)
gets disabled in the bridge, which kind of sucks because then the host has no
address and communication stops.
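
For reference, the setup described above can be written out as the sketch below. The address 192.0.2.1 with netmask 255.255.255.0 is a placeholder, since the post does not give the real one, and DRY_RUN, run and setup_bridge are illustrative names rather than anything NetBSD provides. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the setup from the post. Commands are only printed by default;
# set DRY_RUN=0 and run as root on the NetBSD host to apply them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"       # show the command instead of running it
    else
        "$@"
    fi
}

# Assumption: placeholder address, not taken from the post.
HOST_ADDR=192.0.2.1
HOST_MASK=255.255.255.0

setup_bridge() {
    run ifconfig wm1 inet "$HOST_ADDR" netmask "$HOST_MASK" up  # host address on wm1
    run ifconfig wm2 up                                         # no address, just up
    run ifconfig wm3 up                                         # same as wm2
    run ifconfig bridge0 create up
    run brconfig bridge0 add wm1 stp wm1 add wm2 stp wm2 add wm3 stp wm3
}

setup_bridge
```

With the cable on wm1 unplugged, running "ifconfig wm1" is where the "status: no carrier" line and the DETACHED address would be visible.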

I tried looking for pseudo devices that I could use on the bridge for the host
interface so that it wouldn't matter which port you plug a network in to, but
the only thing I could attach to the bridge was a tap device and using that as
the host interface doesn't seem to work.

The only other devices that I've done bridging on (except my Xen host, but
that's a load of virtual interfaces connected to one real one which always had
a link) are Mikrotik routerboards and there you could set an address on the
bridge device itself; something that you cannot do on NetBSD, apparently.

So how is this supposed to work? Can I force the interface with an address to
stay enabled somehow or is there a pseudo-device that I haven't found that I
should be using?

Staffan

This message was originally sent to netbsd-users, but I was directed to repost
here instead

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de

Robert Swindells

2021-05-08 13:12:24 UTC

Post by Staffan Thomén
I was toying with bridging three of the four ethernet ports on an
pcengines apu4d4 recently and bumped into something that made me unsure
if I understood how to use bridges correctly.
wm1 configured with an address
wm2 no address, just up
wm3 same as 2
brconfig bridge0 add wm1 stp wm1 add wm2 stp wm2 add wm3 stp wm3
This all works, packets flow and addresses are learned on all
interfaces, but ONLY if there is a cable with a link on wm1. It seems to
be that if there is no link on wm1 the interface (status: no carrier and
address shows <DETACHED>) gets disabled in the bridge, which kind of
sucks because then the host has no address and communication stops.

[snip]

Post by Staffan Thomén
So how is this supposed to work? Can I force the interface with an
address to stay enabled somehow or is there a pseudo-device that I
haven't found that I should be using?

My reading is that this is caused by revision 1.175 of if_bridge.c which
explicitly changed bridge(4) to behave in this way.

I have been caught out by this change in a different way. I just use
dhcpcd(8) to configure IPv6 and have static IPv4 addresses on everything
on my LAN.

When I add the upstream interface to a bridge this toggles the status
on the interface and something in dhcpcd(8) or resolvconf(8) overwrites
my /etc/resolv.conf with an empty one.

I think we should revert this change, any other opinions ?

Robert Swindells

Greg Troxel

2021-05-08 14:14:00 UTC

Post by Robert Swindells

Post by Staffan Thomén
brconfig bridge0 add wm1 stp wm1 add wm2 stp wm2 add wm3 stp wm3
This all works, packets flow and addresses are learned on all
interfaces, but ONLY if there is a cable with a link on wm1. It seems to
be that if there is no link on wm1 the interface (status: no carrier and
address shows <DETACHED>) gets disabled in the bridge, which kind of
sucks because then the host has no address and communication stops.

[snip]

Post by Staffan Thomén
So how is this supposed to work? Can I force the interface with an
address to stay enabled somehow or is there a pseudo-device that I
haven't found that I should be using?

My reading is that this is caused by revision 1.175 of if_bridge.c which
explicitly changed bridge(4) to behave in this way.

I can see the point of considering an address detached if link is down
and it's from DHCP. For static, I don't understand the reasoning.

bridges are a layer2 thing; they shouldn't care about the presence of IP
addresses.

Post by Robert Swindells
I have been caught out by this change in a different way. I just use
dhcpcd(8) to configure IPv6 and have static IPv4 addresses on everything
on my LAN.
When I add the upstream interface to a bridge this toggles the status
on the interface and something in dhcpcd(8) or resolvconf(8) overwrites
my /etc/resolv.conf with an empty one.

That seems like a different issue. If you want dhcp processing to only
set resolv.conf when hearing on one particular interface, you should be
able to configure it. But maybe I'm just not following.

Jonathan A. Kollasch

2021-05-08 17:15:15 UTC

Post by Hector
I read that OpenBSD has a vether pseudo-device which is specifically
intended for these kinds of bridge situations.
https://man.openbsd.org/vether
It would be nice if NetBSD had something equivalent to OpenBSD vether

https://man.netbsd.org/vether.4
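
A minimal sketch of how vether(4) might be used for the situation in this thread, assuming a NetBSD version that ships the driver: the host address goes on vether0 instead of a physical port, and vether0 joins the bridge alongside the wm ports, so the address should no longer depend on the link state of any one port. The address 192.0.2.1/255.255.255.0 is a placeholder, and DRY_RUN, run and bridge_with_vether are illustrative names. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: put the host address on vether0 and bridge it with the
# physical ports. Set DRY_RUN=0 and run as root to apply the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"       # show the command instead of running it
    else
        "$@"
    fi
}

# bridge_with_vether ADDR NETMASK PORT...
bridge_with_vether() {
    addr=$1 mask=$2
    shift 2
    run ifconfig vether0 create
    run ifconfig vether0 inet "$addr" netmask "$mask" up   # host address lives here
    run ifconfig bridge0 create up
    run brconfig bridge0 add vether0
    for port in "$@"; do
        run ifconfig "$port" up                            # no address on the ports
        run brconfig bridge0 add "$port" stp "$port"
    done
}

# Assumption: placeholder address; ports as in the original post.
bridge_with_vether 192.0.2.1 255.255.255.0 wm1 wm2 wm3
```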

Michael van Elst

2021-05-08 23:21:07 UTC

Post by Hector
On Sat, 8 May 2021 12:15:15 -0500

Post by Jonathan A. Kollasch
https://man.netbsd.org/vether.4

I see that vether is a recent addition, is only in NetBSD current, and is
not in 9.1 or earlier releases.

It wasn't necessary, tap did just work as a virtual interface.

Hector

2021-05-08 14:38:59 UTC

On Sat, 08 May 2021 14:12:24 +0100

Post by Robert Swindells

[snip]

Post by Staffan Thomén
So how is this supposed to work? Can I force the interface with an
address to stay enabled somehow or is there a pseudo-device that I
haven't found that I should be using?

My reading is that this is caused by revision 1.175 of if_bridge.c which
explicitly changed bridge(4) to behave in this way.
I have been caught out by this change in a different way. I just use
dhcpcd(8) to configure IPv6 and have static IPv4 addresses on everything
on my LAN.
When I add the upstream interface to a bridge this toggles the status
on the interface and something in dhcpcd(8) or resolvconf(8) overwrites
my /etc/resolv.conf with an empty one.
I think we should revert this change, any other opinions ?
Robert Swindells

I also recently discovered this behavior. I tried to bridge a WiFi interface
and an ethernet interface on my NetBSD router, so both wireless and wired
hosts would be on the same subnet. I was quite puzzled that, when the
ethernet cable was unplugged, wireless hosts could not communicate. Through
experimentation I figured out that when there was no carrier on the wired
interface, the IP address and route went away.

I then thought that a solution would be to put the IP address on some
pseudo-device also attached to the bridge, but could not find something
suitable.

I read that OpenBSD has a vether pseudo-device which is specifically
intended for these kinds of bridge situations.
https://man.openbsd.org/vether
It would be nice if NetBSD had something equivalent to OpenBSD vether

I am still looking for a solution to this problem.

Staffan Thomén

2021-05-08 18:19:05 UTC

Post by Jonathan A. Kollasch

https://man.netbsd.org/vether.4

Well, this solves my problem quite neatly.

Thanks!

Staffan

Staffan Thomén

2021-05-09 09:45:25 UTC

Post by Mouse

Post by Hector
I then thought that a solution would be to put the IP address on some
pseudo-device also attached to the bridge, but could not find something
suitable.

Perhaps some way to configure a tap device so that it works for these

...

Post by Michael van Elst
It wasn't necessary, tap did just work as a virtual interface.

So given these hints, tap IS supposed to work as a virtual ethernet interface.

I tried setting this up on my laptop running 9.0_STABLE from Sep 2 2020
(the apu is slated to be installed today so I don't want to do more fiddling
with it) with this config:

interface wm0 up no other config
interface tap0 -"-

brconfig bridge0 add wm0 stp wm0 add tap0

I deliberately did not have dhcpcd running.

After bridge had pondered the stp situation the wm0 interface got to
forwarding, and everything seemed ok.

However as I ran dhcpcd -1 tap0, only ipv6 was configured on the tap interface.

With some judicious tcpdumping I could see that the dhcp requests were leaving
tap0, passing through the bridge and leaving on the wm0 to arrive on the dhcp
server, the replies however left the dhcp server with the right mac, arrived
on the wm0 but never made it to the tap0 device.
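
One way to make this comparison repeatable is to capture only DHCP traffic on both bridge members at the same time. The sketch below just prints the two tcpdump command lines, one per terminal; the helper name dhcp_capture_cmd is illustrative, and wm0 and tap0 are the interfaces from this test. The -e flag prints the link-level header, so the MAC addresses on requests and replies can be compared.

```shell
#!/bin/sh
# Print a tcpdump command line that shows only DHCP traffic on one interface.
# -n: no name resolution, -e: show ethernet (MAC) headers.
dhcp_capture_cmd() {
    echo "tcpdump -n -e -i $1 'udp port 67 or udp port 68'"
}

# Run each printed command as root in its own terminal, then start
# "dhcpcd -1 tap0" and compare what each side of the bridge sees.
for ifc in wm0 tap0; do
    dhcp_capture_cmd "$ifc"
done
```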

If I manually configure tap0 to have an ipv4 address, it responds to ping, but
no icmp packets are seen in the tcpdump (wtf). It also works to talk TCP (I
tried ssh), but no packets from that again visible in the dump on tap0.

If I just dump everything on tap0 I can see some lldp packets the mikrotik
ap/switch it is plugged into sends, various (v4) multicast noise and lots of
ipv6 ndp and ra packets (as expected)

What am I missing now? DHCP replies seem to go missing altogether and some
ipv4 packets aren't shown in tcpdump on the tap interface.

ipv6 on the other hand, works magically.

This was presumably why I thought tap didn't work as a host interface when I
was fiddling with the apu.

Staffan

Mouse

2021-05-08 16:36:47 UTC

Post by Hector
I then thought that a solution would be to put the IP address on some
pseudo-device also attached to the bridge, but could not find
something suitable.

Perhaps some way to configure a tap device so that it works for these
purposes, acting as though there's always a userland program talking to
it even if there isn't? I have a hazy memory that I once changed tap
to do that, but can't find evidence of it now; I am probably conflating
two memories.

I haven't tried it - no convenient way to right now - but perhaps just
a "sleep 999999 <> /dev/tap$N &" would be enough to keep tap$N happy?

Mouse

Staffan Thomén

2021-05-29 20:06:43 UTC

No takers for this? Can someone at least check that I'm not crazy?

I also tried backporting the vether driver, and with a tiny change to packet
accounting it just compiled. It also showed the exact same symptoms.
NetBSD-9.2 also behaves the exact same way with tap.

Staffan

Hector

2021-05-08 21:20:32 UTC

On Sat, 8 May 2021 12:15:15 -0500

Post by Jonathan A. Kollasch

https://man.netbsd.org/vether.4

I see that vether is a recent addition, is only in NetBSD current, and is
not in 9.1 or earlier releases.

Greg Troxel

2021-05-30 14:27:11 UTC

It works partially; DHCP replies don't seem to show up on the tap device so it
does not configure an address. I kept a tcpdump on the tap while running
dhcpcd and I can see dhcpcd retrying the sending to no avail.
Naturally I checked that my DHCP server sends the replies and it does. They
arrive on the real interface on the bridge under test (another simultaneous
tcpdump) but do not seem to pass through to the tap interface. As I
understand it, broadcast packets should be sent to every interface on the
bridge.

At this point I think it's probably time for you to dig into debugging
the kernel. It sounds like there might be a problem.

Additionally, even if I manually configure an address on the tap interface and
I can use it to connect using ssh, tcpdump on the tap device doesn't show the
unicast packets, which I don't think is right either.

The problem could be about the bpf_tap calls as well as forwarding.

/etc/rc.d/dhcpcd stop, remove all addresses from all interfaces.
ifconfig tap0 create up
ifconfig bridge0 create up
brconfig bridge0 add iwn0 stp iwn0 add tap0
<wait until bridge settles into forwarding>
<in another terminal, start tcpdump -i tap0>
dhcpcd -1 tap0

I am unclear on the details, but it's not immediately obvious what
running STP on one interface and not another interface of a bridge
means. I think it's just that the protocol is spoken on one and thus
it's eligible to be disabled, and that isn't true of the other. A
bridge with tap0 and physical0 can't have a loop (unless you make a
second bridge or something) so you could leave that out to simplify.

Watch dhcpcd properly configure IPv6 and finally add a zeroconf address as no
IPv4 broadcast reply from the server arrived.
In my initial test I used the wired wm0 interface, and it acts the same as with
the wireless.
It's entirely possible that I've fudged up something else in my network, I'd be
the first to say I don't know what I'm doing but I find that unlikely as the same
DHCP server serves a dozen or so clients on this network, some of which are
Xen domU:s that also live behind a bridge.

My only hint is don't assume that you must be doing something wrong.
It's possible that bridge support has gotten broken somehow. I had
trouble with it under I think 7 and 8 with vr0 (net5501) and I just
stopped using it. There, it was about which mac address was used for
NDP and I would get broken v6 connectivity because hosts on e.g. vr0
were using the ethernet address of vr2 in the IPv6 NDP tables, or
something like that.

One issue is that tap0 will show "status: no carrier", presumably
because the controlling device isn't open. That tends to cause various
things to behave differently.

So maybe you want something like tap, but that thinks it is connected to
an ethernet that is valid but has nothing else. That may be doable by
opening /dev/tap0 and just throwing away read data, and it might be that
tap should have link0 flag that means 'pretend there is carrier'.
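
The "open /dev/tap0 and throw away read data" idea could be tried with something like the sketch below. It is untested and rests on assumptions: that a /dev/tapN node exists for the tapN interface already created with ifconfig, and that holding it open is what changes the reported link state. DRY_RUN and hold_tap_open are illustrative names. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch only: keep /dev/tapN open by reading frames and discarding them.
# Set DRY_RUN=0 and run as root to start the background reader.
DRY_RUN=${DRY_RUN:-1}

# hold_tap_open N
hold_tap_open() {
    dev=/dev/tap$1
    if [ "$DRY_RUN" = 1 ]; then
        echo "cat $dev > /dev/null &"     # show the command instead of running it
    else
        cat "$dev" > /dev/null &          # reader holds the device open
        echo "reader pid: $!"             # kill this pid to close the device again
    fi
}

hold_tap_open 0
```

Checking "ifconfig tap0" before and after starting the reader would show whether the status changes on a given release.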

Staffan Thomén

2021-06-04 18:13:18 UTC

At this point I think it's probably time for you to dig into debugging the
kernel. It sounds like there might be a problem.

Sure, but I was hoping that someone who already is familiar with kernel
programming and the bits involved would wake up and save me the trouble :-)

[tcpdump doesn't show all packets on tap]

The problem could be about the bpf_tap calls as well as forwarding.

Alright, I don't know how that works.

I am unclear on the details, but it's not immediately obvious what running
STP on one interface and not another interface of a bridge means. I think
it's just that the protocol is spoken on one and thus it's eligible to be
disabled, and that isn't true of the other. A bridge with tap0 and
physical0 can't have a loop (unless you make a second bridge or something)
so you could leave that out to simplify.

I tried leaving it out but it just makes configuring instant because the bridge
doesn't have to go through the listening/learning phases before forwarding
packets.
AFAIK STP couldn't hurt anything, but you're absolutely right to invoke
occam's razor here.

My only hint is don't assume that you must be doing something wrong. It's
possible that bridge support has gotten broken somehow. I had trouble with
it under I think 7 and 8 with vr0 (net5501) and I just stopped using it.
There, it was about which mac address was used for NDP and I would get
broken v6 connectivity because hosts on e.g. vr0 were using the ethernet
address of vr2 in the IPv6 NDP tables, or something like that.
One issue is that tap0 will show "status: no carrier", presumably because
the controlling device isn't open. That tends to cause various things to
behave differently.
So maybe you want something like tap, but that thinks it is connected to an
ethernet that is valid but has nothing else. That may be doable by
opening /dev/tap0 and just throwing away read data, and it might be that
tap should have link0 flag that means 'pretend there is carrier'.

Thanks for the confidence :-)

I'm not sure what you mean about carrier though, ifconfig tap0 has no status
field and always seems to be up (no tentative addresses for instance, which
was the problem with configured-but-without-cable real interfaces somewhere
in the beginning of this thread).

Staffan
