Discussion:
FAST_IPSEC doesn't send ICMP frag needed
(too old to reply)
Dave Huang
2013-12-20 04:26:42 UTC
Permalink
It appears that (FAST_)IPSEC doesn't send ICMP fragmentation needed
when it gets a Don't Fragment packet that needs to be fragmented
because of encapsulation overhead. Beverly Schwartz posted an analysis
of the problem last year
<http://mail-index.netbsd.org/tech-net/2012/10/16/msg003674.html>, but
nobody said anything :(

Does anyone have thoughts about the proposed fix in that message? My
IPSEC tunnel isn't working right, probably due to AT&T lossage--
AFAICT, fragmented packets aren't making it out... tcpdump on my end
of the tunnel shows an ESP packet fragmented into two pieces being
sent out. However, only the first fragment makes it to the other end
of the tunnel; the second one is nowhere to be seen. Fragmented pings
(not in an IPSEC tunnel) aren't making it out either. I'd like to work
around it by not sending fragmented packets, but PMTUD is broken
because NetBSD isn't sending ICMP fragmentation needed.
--
Name: Dave Huang | Mammal, mammal / their names are called /
INet: ***@azeotrope.org | they raise a paw / the bat, the cat /
FurryMUCK: Dahan | dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 38 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Dave Huang
2013-12-20 06:11:26 UTC
Permalink
BTW, here's Beverly's change in patch form, updated for -current, plus
a change that makes the ICMP frag needed contain the route MTU instead
of interface MTU if there is one. (See my earlier message:
<http://mail-index.netbsd.org/tech-net/2013/12/19/msg004418.html>). It
also takes into account Beverly's comment that ipsec_hdrsiz returns a
strange (meaning odd number) header length. I think this should be
decreased until we have a multiple of 4." I'm assuming she meant that
the header length should be *increased* to a multiple of 4 (so the
reported MTU is decreased).

And since ip_forward() was already getting the MTU, I figure there's
no need for ipsec4_forward() to do it again... especially since it
doesn't actually work (sp->req->sav is NULL in ipsec4_forward()).

Index: netinet/ip_input.c
===================================================================
RCS file: /cvsroot/src/sys/netinet/ip_input.c,v
retrieving revision 1.308
diff -u -r1.308 ip_input.c
--- netinet/ip_input.c 29 Jun 2013 21:06:58 -0000 1.308
+++ netinet/ip_input.c 20 Dec 2013 06:04:33 -0000
@@ -1335,7 +1335,8 @@
code = ICMP_UNREACH_NEEDFRAG;

if ((rt = rtcache_validate(&ipforward_rt)) != NULL)
- destmtu = rt->rt_ifp->if_mtu;
+ destmtu = rt->rt_rmx.rmx_mtu ?
+ rt->rt_rmx.rmx_mtu : rt->rt_ifp->if_mtu;
#ifdef IPSEC
(void)ipsec4_forward(mcopy, &destmtu);
#endif
Index: netipsec/ipsec.c
===================================================================
RCS file: /cvsroot/src/sys/netipsec/ipsec.c,v
retrieving revision 1.60
diff -u -r1.60 ipsec.c
--- netipsec/ipsec.c 8 Jun 2013 13:50:22 -0000 1.60
+++ netipsec/ipsec.c 20 Dec 2013 06:04:34 -0000
@@ -806,6 +806,17 @@
}

/*
+ * Check that MTU is sufficient.
+ */
+ if (ntohs(ip->ip_off) & IP_DF) {
+ size_t ipsec_hdrlen = ipsec_hdrsiz(sp);
+ if (ntohs(ip->ip_len) + ipsec_hdrlen > *mtu) {
+ splx(s);
+ return EMSGSIZE;
+ }
+ }
+
+ /*
* Do delayed checksums now because we send before
* this is done in the normal processing path.
*/
@@ -912,24 +923,10 @@
return EINVAL;
}

- /* Count IPsec header size. */
- ipsechdr = ipsec4_hdrsiz(m, IPSEC_DIR_OUTBOUND, NULL);
+ /* Count IPsec header size, rounded up to multiple of 4. */
+ ipsechdr = roundup2(ipsec4_hdrsiz(m, IPSEC_DIR_OUTBOUND, NULL), 4);
+ *destmtu -= ipsechdr;

- /*
- * Find the correct route for outer IPv4 header, compute tunnel MTU.
- */
- if (sp->req && sp->req->sav && sp->req->sav->sah) {
- struct route *ro;
- struct rtentry *rt;
-
- ro = &sp->req->sav->sah->sa_route;
- rt = rtcache_validate(ro);
- if (rt && rt->rt_ifp) {
- *destmtu = rt->rt_rmx.rmx_mtu ?
- rt->rt_rmx.rmx_mtu : rt->rt_ifp->if_mtu;
- *destmtu -= ipsechdr;
- }
- }
KEY_FREESP(&sp);
return 0;
}
--
Name: Dave Huang | Mammal, mammal / their names are called /
INet: ***@azeotrope.org | they raise a paw / the bat, the cat /
FurryMUCK: Dahan | dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 38 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Greg Troxel
2013-12-20 16:04:16 UTC
Permalink
Post by Dave Huang
It appears that (FAST_)IPSEC doesn't send ICMP fragmentation needed
when it gets a Don't Fragment packet that needs to be fragmented
because of encapsulation overhead. Beverly Schwartz posted an analysis
of the problem last year
<http://mail-index.netbsd.org/tech-net/2012/10/16/msg003674.html>, but
nobody said anything :(
I think that was partly because this is hard and not that many people
are having trouble and partly because at least for some of the issues
there are spec ambiguities.

Thanks for sending the updated patch. It is probably better to apply a
sensible change that helps rather than dither forever about the perfect
fix. (I won't get a change to look at this for a few weeks.)
Objections?
Greg Troxel
2013-12-20 16:06:44 UTC
Permalink
Post by Dave Huang
And since ip_forward() was already getting the MTU, I figure there's
no need for ipsec4_forward() to do it again... especially since it
doesn't actually work (sp->req->sav is NULL in ipsec4_forward()).
I think the concept is that a packet that would be routed out one
interface matches an SPD entry and can get put in a tunnel that causes
the encapsulated packet to be sent out a different interface. Really
only the interface that gets the tunnel packet should matter (and the
route MTU for that outer dst).
Dave Huang
2013-12-20 17:22:57 UTC
Permalink
Post by Greg Troxel
I think the concept is that a packet that would be routed out one
interface matches an SPD entry and can get put in a tunnel that causes
the encapsulated packet to be sent out a different interface. Really
only the interface that gets the tunnel packet should matter (and the
route MTU for that outer dst).
Ah, yeah... after looking into it more, the destmtu from ip_forward()
is for the inner packet's interface/route, not the outer. So something
like the code I deleted is needed, but that exact code doesn't
actually work (and I definitely don't know enough to make it work :)
--
Name: Dave Huang | Mammal, mammal / their names are called /
INet: ***@azeotrope.org | they raise a paw / the bat, the cat /
FurryMUCK: Dahan | dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 38 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
David Young
2014-01-02 23:11:05 UTC
Permalink
Post by Dave Huang
BTW, here's Beverly's change in patch form, updated for -current, plus
a change that makes the ICMP frag needed contain the route MTU instead
<http://mail-index.netbsd.org/tech-net/2013/12/19/msg004418.html>). It
also takes into account Beverly's comment that ipsec_hdrsiz returns a
strange (meaning odd number) header length. I think this should be
decreased until we have a multiple of 4." I'm assuming she meant that
the header length should be *increased* to a multiple of 4 (so the
reported MTU is decreased).
And since ip_forward() was already getting the MTU, I figure there's
no need for ipsec4_forward() to do it again... especially since it
doesn't actually work (sp->req->sav is NULL in ipsec4_forward()).
Index: netinet/ip_input.c
===================================================================
RCS file: /cvsroot/src/sys/netinet/ip_input.c,v
retrieving revision 1.308
diff -u -r1.308 ip_input.c
--- netinet/ip_input.c 29 Jun 2013 21:06:58 -0000 1.308
+++ netinet/ip_input.c 20 Dec 2013 06:04:33 -0000
@@ -1335,7 +1335,8 @@
code = ICMP_UNREACH_NEEDFRAG;
if ((rt = rtcache_validate(&ipforward_rt)) != NULL)
- destmtu = rt->rt_ifp->if_mtu;
+ destmtu = rt->rt_rmx.rmx_mtu ?
+ rt->rt_rmx.rmx_mtu : rt->rt_ifp->if_mtu;
#ifdef IPSEC
(void)ipsec4_forward(mcopy, &destmtu);
#endif
I think this part of the patch is incorrect for the reasons described
by others in the "ICMP_UNREACH_NEEDFRAG returns iface MTU instead of
route?" thread on tech-net.

Dave
--
David Young
***@pobox.com Urbana, IL (217) 721-9981

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Loading...