I have configured my haproxy.cfg as follows:
global
    chroot /var/run/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 6000
    user haproxy-user
    group haproxy-group
    daemon
    stats socket /var/run/haproxy-admin.sock

defaults
    log global
    option httplog
    option dontlognull
    retries 2
    timeout connect 4000ms
    timeout client 4000ms
    timeout server 4000ms

backend app_servers
    balance roundrobin
    option tcplog
    option tcp-check
    option httpchk GET /check-status
    server app1 10.10.10.10:8080 check
    server app2 20.20.20.20:8080 check
Logs that I received:
Server app2 is DOWN, Layer7 timeout, check duration: 2000ms.
Server app2 is UP, Layer7 check passed, code: 200, duration: 150ms.
When the host is marked DOWN, the response is a 504 error:
20.20.20.20 504 POST /service/request
10.10.10.10 200 POST /service/request
My question is, despite setting a timeout of 4000ms, why does the error appear when the response time of the backend server exceeds 2000ms? Can the timeout be adjusted to prevent this error?
The issue you’re experiencing is most likely the health check timeout, which is separate from the ‘timeout server’ value you’ve set. When no ‘timeout check’ is configured, HAProxy uses the check interval (‘inter’, 2000ms by default) as the total time allowed for a check, which matches the 2000ms in your “Layer7 timeout” log line. To resolve this, add a specific ‘timeout check’ directive to your backend configuration.
Try adding this line to your backend section:
timeout check 4000ms
This will align the health check timeout with your other timeouts. Also, ensure your /check-status endpoint responds quickly. If it’s consistently slow, you might need to optimize it or consider using a different health check method.
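For reference, here is a minimal sketch of how the backend could look with the check timeout set explicitly. The ‘inter’, ‘fall’, and ‘rise’ values on the server lines are illustrative assumptions, not taken from the original configuration, and should be tuned to how quickly /check-status normally responds.

```
backend app_servers
    balance roundrobin
    option httpchk GET /check-status
    # Time allowed for the check response once the check connection is established.
    timeout check 4000ms
    # inter/fall/rise below are assumed example values:
    # probe every 5s, mark DOWN after 3 failed checks, UP after 2 successful ones.
    server app1 10.10.10.10:8080 check inter 5s fall 3 rise 2
    server app2 20.20.20.20:8080 check inter 5s fall 3 rise 2
```

With ‘timeout check’ set, the connect phase of the check is still bounded by the smaller of ‘timeout connect’ and ‘inter’, so a very short ‘inter’ can still cut a check short.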
Remember, while increasing timeouts can prevent false negatives, it’s crucial to balance this with maintaining responsiveness for your users. Monitor your backend performance closely to ensure it’s meeting your service level objectives.
Hey there! Have you tried adding ‘timeout check 4000ms’ to your backend config? That might help if the health check is timing out too fast. Also, double-check your backend server’s performance, since slow responses can trigger these issues. Hope this helps! Let me know if you need more info.
Hm, interesting setup! Have you tried adjusting the ‘timeout check’ setting specifically? It can differ from the other timeouts. Also, what’s your server’s typical response time? Maybe the 2000ms is a default limit somewhere, such as the check interval? Just curious, have you noticed any patterns in when it happens?