Discussion: squid proxy issue
Tuan Nguyen
2010-10-21 15:05:28 UTC
Hi all,

We're experiencing this strange issue with squid on NetBSD (5.1 RC3).
During busy hours we're seeing repeated error messages in squid's
cache.log:

Failed to select source for 'http://...'
TCP connection to x.x.x.x/3128 failed
TCP connection to x.x.x.x/3128 failed
TCP connection to x.x.x.x/3128 failed
...

x.x.x.x is the upstream proxy as configured in squid.conf:

  cache_peer x.x.x.x parent 3128 0 no-query default name=x.x.x.x:3129

As a result, a user on a client machine would see an "Unable to forward
this request at this time" message in their browser. This doesn't occur
when the network is less busy, and we're seeing the same issue with
different upstream proxies. Is there anything we could tweak on
squid/NetBSD to overcome this, or are the parent proxies to blame here?
Any advice would be great. Thanks.

Regards,
Tuan

Manuel Bouyer
2010-10-22 21:01:03 UTC
Post by Tuan Nguyen
Hi all,
We're experiencing this strange issue with squid on NetBSD (5.1
RC3). During busy hours we're seeing repeated error messages in
squid's cache.log:
Failed to select source for 'http://...'
TCP connection to x.x.x.x/3128 failed
TCP connection to x.x.x.x/3128 failed
TCP connection to x.x.x.x/3128 failed
...
cache_peer x.x.x.x parent 3128 0 no-query default name=x.x.x.x:3129
As a result, a user on a client machine would see an "Unable to forward
this request at this time" message in their browser. This doesn't occur
when the network is less busy, and we're seeing the same issue with
different upstream proxies. Is there anything we could tweak on
squid/NetBSD to overcome this, or are the parent proxies to blame here?
Any advice would be great. Thanks.
Did you check if squid is hitting some resource limit, maybe
file descriptors?
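
(A quick way to check, assuming squid's cachemgr interface is enabled
and squidclient is installed:

  squidclient mgr:info | grep -i 'file desc'

which should report the maximum and currently available file descriptor
counts.)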
--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference
Joerg Sonnenberger
2010-10-22 21:04:31 UTC
Post by Manuel Bouyer
Did you check if squid is hitting some resource limit, maybe
file descriptors?
Or sockets in TIME_WAIT state.
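
(One rough way to check, assuming stock netstat and awk, is to count
sockets per TCP state:

  netstat -an -f inet | awk '/^tcp/ {print $6}' | sort | uniq -c

and look for an unusually large TIME_WAIT count.)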

Joerg

Stephen Borrill
2010-10-25 07:31:24 UTC
Post by Joerg Sonnenberger
Post by Manuel Bouyer
Did you check if squid is hitting some resource limit, maybe
file descriptors?
Or sockets in TIME_WAIT state.
Tuan (my co-worker) will correct me if I'm wrong, but it's proving to be
an ipfilter problem. With ipfilter disabled, there are literally zero
errors (we did raise the file descriptor limit to 8192, BTW).

On a related issue, we are also seeing very slow rsync transfers under
some circumstances. Again these are fixed by disabling ipfilter.

The ipf.conf files are quite complex and are built up automatically from a
machine-wide configuration file that includes things like:
- IP addresses and netmasks
- firewall security level
- NAT on/off

I've recommended starting from a simple ipf.conf (basically just pass in
all/pass out all) and building up from there to see if we can work out
what triggers the problem.
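
Such a minimal ipf.conf would be just:

  pass in all
  pass out all

with no state-keeping or NAT, and then rule groups re-added one at a
time until the errors reappear.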
--
Stephen

Stephen Borrill
2010-11-09 11:50:31 UTC
Post by Joerg Sonnenberger
Post by Manuel Bouyer
Did you check if squid is hitting some resource limit, maybe
file descriptors?
Or sockets in TIME_WAIT state.
Tuan (my co-worker) will correct me if I'm wrong, but it's proving to be an
ipfilter problem. With ipfilter disabled, there are literally zero errors
(we did raise the file descriptor limit to 8192, BTW).
For the record, this turned out to be exhaustion of the ipfilter state
table.

From sys/dist/ipf/netinet/ip_state.h:

# define IPSTATE_SIZE 5737
# define IPSTATE_MAX 4013 /* Maximum number of states held */

These need to be primes, with IPSTATE_MAX being about 70% of IPSTATE_SIZE
(30011 and 21011 are both prime, and 21011/30011 ≈ 70%).
So I increased these by adding the following to my kernel config file:
options IPSTATE_SIZE=30011
options IPSTATE_MAX=21011
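
(These options take effect only after a kernel rebuild; the usual NetBSD
steps, with the arch and config name here as placeholders, are roughly:

  cd /usr/src/sys/arch/i386/conf
  config MYKERNEL
  cd ../compile/MYKERNEL
  make depend && make

followed by installing the new kernel and rebooting.)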

I've tracked the usage with:
ipfstat -sl | grep '^[^[:space:]]' | wc -l

I've seen it go up to 11k or so.
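
(If memory serves, ipfstat can also print the state-table statistics
directly, including a counter for how often the table filled up:

  ipfstat -s

which may be simpler than counting lines.)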
--
Stephen

Tuan Nguyen
2010-10-25 10:33:49 UTC
Post by Stephen Borrill
Post by Joerg Sonnenberger
Post by Manuel Bouyer
Did you check if squid is hitting some resource limit, maybe
file descriptors?
I'm not seeing any error in squid's log that indicates these limits have
been reached (something like "Warning: your cache is running out of file
descriptors", IIRC). We did increase this limit in squid but that doesn't
help.
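(For reference, how the limit is raised depends on the Squid version;
as an unverified sketch, newer Squid 3.x accepts a squid.conf directive

  max_filedescriptors 8192

while older versions need the process limit raised in the startup
script, e.g. ulimit -n 8192, before squid is started.)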
Post by Stephen Borrill
Post by Joerg Sonnenberger
Or sockets in TIME_WAIT state.
I've run netstat at multiple sites when the error occurred, but the number
of local sockets in use seemed to vary greatly. If this is the case,
what could we do?
Post by Stephen Borrill
Tuan (my co-worker) will correct me if I'm wrong, but it's proving to
be an ipfilter problem. With ipfilter disabled, there are literally
zero errors (we did raise the file descriptor limit to 8192, BTW).
On a related issue, we are also seeing very slow rsync transfers under
some circumstances. Again these are fixed by disabling ipfilter.
The ipf.conf files are quite complex and are built up automatically
from a machine-wide configuration file that includes things like:
- IP addresses and netmasks
- firewall security level
- NAT on/off
I've recommended starting from a simple ipf.conf (basically just pass
in all/pass out all) and building up from there to see if we can work
out what triggers the problem.
Oddly enough, as Stephen mentioned, disabling ipfilter seems to make the
error go away completely. Could it be that our firewall is dropping
packets, which makes squid think its parent proxy is dead?
I'm working on building up the firewall from a minimal set of rules to
see if this is the case.

Digging into the tech-net archive from last year, I found this post,
which looks like it might be related. Any thoughts on this?
http://mail-index.netbsd.org/tech-net/2009/11/17/msg001719.html

Below is a TCP conversation between squid and the upstream proxy during
busy traffic. The part that's puzzling me is that I'm seeing repeated,
identical RST packets (3 or 4 per TCP connection) sent by squid to the
upstream proxy.
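
(For anyone wanting to reproduce this: a trace like the one below can be
captured with something like

  tcpdump -n -i <iface> host <upstream-ip> and port 3129

where the interface and upstream address are placeholders.)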

14:18:25.532933 IP netbsd.57601 > upstream.3129: S
3791930686:3791930686(0) win 32768 <mss 1460,nop,wscale
3,sackOK,nop,nop,nop,nop,timestamp 1 0>
14:18:25.538322 IP upstream.3129 > netbsd.57601: S
1766808328:1766808328(0) ack 3791930687 win 5792 <mss
1460,sackOK,timestamp 3382782028 1,nop,wscale 2>
14:18:25.538353 IP netbsd.57601 > upstream.3129: . ack 1 win 4197
<nop,nop,timestamp 1 3382782028>
14:18:25.538492 IP netbsd.57601 > upstream.3129: P 1:816(815) ack 1 win
4197 <nop,nop,timestamp 1 3382782028>
14:18:25.545471 IP upstream.3129 > netbsd.57601: . ack 816 win 1856
<nop,nop,timestamp 3382782034 1>
14:18:26.387247 IP netbsd.57601 > upstream.3129: F 816:816(0) ack 1 win
4197 <nop,nop,timestamp 3 3382782034>
14:18:26.434730 IP upstream.3129 > netbsd.57601: . ack 817 win 1856
<nop,nop,timestamp 3382782923 3>
14:18:28.606196 IP upstream.3129 > netbsd.57601: . 1:1449(1448) ack 817
win 1856 <nop,nop,timestamp 3382785095 3>
14:18:28.606197 IP upstream.3129 > netbsd.57601: P 1449:1553(104) ack
817 win 1856 <nop,nop,timestamp 3382785095 3>
14:18:28.606199 IP upstream.3129 > netbsd.57601: . 1553:3001(1448) ack
817 win 1856 <nop,nop,timestamp 3382785095 3>
14:18:28.606225 IP netbsd.57601 > upstream.3129: R
3791931503:3791931503(0) win 0
14:18:28.606238 IP netbsd.57601 > upstream.3129: R
3791931503:3791931503(0) win 0
14:18:28.606249 IP netbsd.57601 > upstream.3129: R
3791931503:3791931503(0) win 0


Thanks for all your help so far.

Kind regards,
Tuan

Matthias Scheler
2010-10-23 15:53:13 UTC
Post by Tuan Nguyen
We're experiencing this strange issue with squid on NetBSD (5.1 RC3).
Which version of Squid are you using?

Kind regards
--
Matthias Scheler http://zhadum.org.uk/