Troubleshooting Load Balancer Backend Server Issues
Learn about backend server issues associated with load balancers.
Debugging a Backend Server Timeout
When a backend server takes longer than the allowed time to respond to a request, the
load balancer returns a 504 error, indicating that the backend server is either down or
not responding to the request that the load balancer forwarded. The client application
receives the following response code: HTTP/1.1 504 Gateway Timeout.
This error can occur for the following reasons:
The load balancer failed to establish a connection to the backend server before
the connection timeout expired.
The load balancer established a connection to the backend server but the backend
did not respond before the idle timeout period elapsed.
The security lists or network security groups for the subnet or the VNIC did not
allow traffic from the backends to the load balancer.
The backend server or application server failed.
Follow these steps to troubleshoot the backend server timeout errors:
Use the curl utility to directly test the backend server from a
host in the same network.
curl -i http://backend_ip_address
If the backend takes longer than one second to respond to this test, an
application-level issue is likely causing latency. Oracle recommends that you check
any upstream dependencies that might cause latency, including:
Network attached storage such as iSCSI or NFS
Database latency
An off-premises API
An application tier
Check the application by accessing it directly from the backend server. Check its
access logs to determine if the application can be accessed and is functioning
properly.
If the load balancer and the backend server are in different subnets, then check
whether the security lists contain rules to allow traffic. If no rules exist,
then traffic is not allowed.
Enter the following commands to determine whether firewall rules exist on the
backend servers that block traffic:
sudo iptables -L: lists all firewall rules enforced by iptables
sudo firewall-cmd --list-all: lists all firewall rules enforced by firewalld
Enable logging on the load balancer to determine whether the load balancer or the
backend server is causing the latency.
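The timing and firewall checks in the preceding steps can be scripted. The following is a minimal sketch, not an Oracle-provided tool. The function names time_backend, exceeds_one_second, and blocking_rules_for_port are hypothetical, and the filter assumes the numeric listing format of iptables -L -n, which can vary between iptables versions.

```shell
# Hypothetical helper: print the total time, in seconds, that curl needs to
# fetch a URL. Uses curl's --write-out variable %{time_total}.
time_backend() {
    curl -s -o /dev/null --max-time 30 -w '%{time_total}' "$1"
}

# Hypothetical helper: succeed (exit 0) when the given time in seconds is
# greater than the one-second guideline from the first step.
exceeds_one_second() {
    awk -v t="$1" 'BEGIN { exit !(t > 1.0) }'
}

# Hypothetical helper: read `iptables -L -n` output on stdin and print the
# DROP or REJECT rules that name the given destination port.
# Assumption: rules show the port as "dpt:<number>".
blocking_rules_for_port() {
    awk -v p="$1" '($1 == "DROP" || $1 == "REJECT") && $0 ~ ("dpt:" p "([^0-9]|$)")'
}

# Example usage (run on a host in the same network as the backend):
#   t=$(time_backend http://backend_ip_address)
#   if exceeds_one_second "$t"; then echo "slow response: ${t}s"; fi
#   sudo iptables -L -n | blocking_rules_for_port 80
```

The port filter is only a rough aid. Rules that drop all traffic, or that match ports through other extensions, do not contain "dpt:" and still need to be reviewed by hand.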
Testing TCP and HTTP Backend Servers
This topic describes how to troubleshoot a load balancer connection. The topology used in
this procedure has a public load balancer in a public subnet, with the backends in the
same subnet.
Oracle recommends that you use the Oracle Cloud Infrastructure Logging service to troubleshoot issues. (See Details for Load Balancer Logs.)
In addition to the Logging service, you can use the utilities listed in this section to troubleshoot the traffic that the load balancer processes and sends to a backend. To perform these tests, Oracle recommends that you create an instance in the same network as your load balancer and allow the traffic in the same network security groups and security lists. Use the following tools to troubleshoot:
ping
Before using the more advanced utilities listed here, Oracle recommends that you
perform a basic ping test. For this test to succeed, you must
allow ICMP traffic between the test instance and the
backend.
$ ping backend_ip_address
The response should look similar to:
PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data.
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.028 ms
64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.044 ms
If you receive a message that contains "64 bytes from...", then the ping
succeeded.
Receiving a message that contains "Destination Host Unreachable" indicates that
no system responded at that address, for example because the system does not exist.
Receiving no message usually indicates that the system exists but the ICMP protocol
is not allowed. Check all firewalls, security lists, and network security groups to
ensure that ICMP is allowed.
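The three outcomes described above can be told apart in a script. The following is a minimal sketch. The function name classify_ping_output and its result labels are hypothetical, and the match strings assume the message wording shown in this section.

```shell
# Hypothetical helper: classify captured ping output using the three cases
# described above.
classify_ping_output() {
    case "$1" in
        *"bytes from"*)                   echo "reachable" ;;
        *"Destination Host Unreachable"*) echo "unreachable" ;;
        *)                                echo "no-reply" ;;
    esac
}

# Example usage (Linux ping; -c sets the packet count and -W the reply
# timeout in seconds):
#   classify_ping_output "$(ping -c 3 -W 2 backend_ip_address 2>&1)"
```

A result of "no-reply" is the case in which to check firewalls, security lists, and network security groups for ICMP rules.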
curl
Use the curl utility to send HTTP requests to a specific host,
port, or URL.
The following example shows using curl to connect to a
backend that is sending a 403 Forbidden error:
In the preceding example, the health check fails, returning a 403 error, indicating that the backend does not have local file permissions configured properly for the health check page.
The following example shows using curl to connect to a
backend that is sending a 404 Not Found error:
$ curl -I http://backend_ip_address/health
HTTP/1.1 404 Not Found
Date: Tue, 17 Mar 2021 17:47:10 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 3539
Connection: keep-alive
Last-Modified: Tue, 10 Mar 2021 20:33:28 GMT
ETag: "dd3-5b3c6975e7600"
Accept-Ranges: bytes
In the preceding example, the health check fails, returning a 404 error, indicating that the health check page does not exist in the expected location.
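When you test several backends, it can help to reduce each curl response to its status code and map that code to the causes described in this topic. The following is a minimal sketch. The function names backend_status and explain_status are hypothetical, and the hint text only paraphrases this topic.

```shell
# Hypothetical helper: print only the HTTP status code for a URL.
# -I sends a HEAD request, as in the examples above. curl prints 000 when
# it receives no HTTP response at all.
backend_status() {
    curl -s -I -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Hypothetical helper: map a status code to the likely cause named in this
# topic.
explain_status() {
    case "$1" in
        2??) echo "health check page is reachable" ;;
        403) echo "check file permissions for the health check page" ;;
        404) echo "health check page is not in the expected location" ;;
        504) echo "backend timed out or is not responding" ;;
        000) echo "no HTTP response; check the backend, firewalls, and security rules" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example usage:
#   explain_status "$(backend_status http://backend_ip_address/health)"
```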
The following example shows a backend that exists but whose traffic is blocked
by a network security group, the security lists, or a local
firewall:
tcpdump
Use the tcpdump utility to capture all traffic to a backend to
determine which traffic is coming from the load balancer and what is being
returned to the load balancer.
sudo tcpdump -i any -A port port_number and src load_balancer_ip_address
11:25:54.799014 IP 192.0.2.224.39224 > 192.0.2.224.80: Flags [P.], seq 1458768667:1458770008, ack 2440130792, win 704, options [nop,nop,TS val 461552632 ecr 208900561], length 1341: HTTP: POST /health HTTP/1.1
OpenSSL
When troubleshooting SSL issues between the load balancer instance and the
backend servers, Oracle recommends using the openssl utility.
This utility opens an SSL connection to a specific host name and port, and
prints the SSL certificate and other parameters.
Other options for troubleshooting issues are:
-showcerts
This option prints all certificates in the certificate chain
presented by the backend server. Use this option to identify issues,
such as a missing intermediate certificate authority
certificate.
-cipher cipher_name
This option forces the client and server to use a specific cipher suite
and helps to determine whether the backend allows specific
ciphers.
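The options above belong to the s_client subcommand of openssl, which opens the connection described in this section. The following is a minimal sketch of how the command and its options fit together. The function name probe_backend_tls and the DRY_RUN variable are hypothetical conveniences, not part of openssl.

```shell
# Hypothetical helper: open a TLS connection to host:port with
# `openssl s_client` and pass any extra options through unchanged.
# Set DRY_RUN=1 to print the command instead of running it.
probe_backend_tls() {
    host=$1
    port=$2
    shift 2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "openssl s_client -connect ${host}:${port} $*"
        return 0
    fi
    # Redirecting stdin from /dev/null makes s_client exit once the
    # handshake output has been printed.
    openssl s_client -connect "${host}:${port}" "$@" </dev/null
}

# Example usage:
#   probe_backend_tls backend_ip_address 443 -showcerts
#   probe_backend_tls backend_ip_address 443 -cipher ECDHE-RSA-AES256-GCM-SHA384
```

Note that -cipher applies to TLS 1.2 and earlier. For TLS 1.3, openssl s_client selects cipher suites with the separate -ciphersuites option.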
Netstat
Use the netstat -natp command to ensure that the application
running on the backend server is up and running. For TCP or HTTP traffic, the
backend application must be listening on the expected IP address and port. If
the application port on the backend server is not in the LISTEN state, then
the application is not accepting connections on that TCP port.
To resolve this issue, ensure that the application is up and running by
restarting either the application or the backend server.
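The listen-state check can be automated by filtering the netstat output. The following is a minimal sketch. The function name port_is_listening is hypothetical, and the filter assumes the Linux netstat column layout, in which the local address is the fourth column and the state is the sixth.

```shell
# Hypothetical helper: read `netstat -natp` output on stdin and succeed
# (exit 0) when some socket is in the LISTEN state on the given port.
port_is_listening() {
    awk -v p="$1" '$6 == "LISTEN" && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Example usage (run on the backend server):
#   if sudo netstat -natp | port_is_listening 80; then
#       echo "application port is listening"
#   else
#       echo "application port is not listening"
#   fi
```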