NGINX for Load Balancing

Aman Jain
5 min read · Oct 11, 2020

Load balancing is an excellent way to scale out your application and increase its performance. It distributes the incoming traffic across multiple servers configured behind it. NGINX acts as a single entry point to a distributed web application working on multiple separate servers.

Redirecting Traffic to a Group of Servers

Assumption: you have NGINX installed on your local machine and it is up and running. If you need help installing NGINX, see the official installation guide.

Note: Every time you make changes to the nginx.conf file, you need to reload your nginx server. You can do this by running the following command:

nginx -s reload

To start using NGINX to load balance HTTP traffic to a group of servers, first open your nginx.conf file, typically found at /usr/local/etc/nginx/nginx.conf or wherever your NGINX is installed. In that file, define the upstream directive, which is placed in the http context. An upstream block defines a group of servers; each member is configured with a server directive. For example, the following configuration defines a group named go-backend consisting of three server entries (which may resolve to more than three actual servers):

events { }

http {
    upstream go-backend {
        server localhost:8001;
        server localhost:8002;
        server localhost:8003;
    }
}

As you can see above, the group consists of three servers running on the same local machine on different ports, i.e. 8001, 8002 and 8003.
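For completeness, here is a hypothetical sketch of one such backend in Go (chosen to match the go-backend name; the /hello route and the "Hello world" JSON body are assumptions based on the curl example later in the article). You would run one copy per port:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// helloBody builds the JSON payload. The "Hello world" message and the
// /hello route are assumptions based on the curl example later in the
// article; adapt them to your own application.
func helloBody(port string) string {
	return fmt.Sprintf(`{"message": "Hello world", "port": %q}`, port)
}

func main() {
	port := "8001" // run one copy of this process per backend: 8001, 8002, 8003
	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, helloBody(port))
	})
	go http.ListenAndServe(":"+port, nil) // serve in the background

	// Self-check: hit our own /hello once and exit. A real backend would
	// simply block on ListenAndServe instead of serving in a goroutine.
	time.Sleep(100 * time.Millisecond)
	resp, err := http.Get("http://localhost:" + port + "/hello")
	if err != nil {
		panic(err)
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	fmt.Println(string(body))
}
```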

To pass requests to a server group, the name of the group is specified in the proxy_pass directive. Following is the modified code containing the server and upstream block. Basically, the code below proxies the incoming HTTP requests to the go-backend server group. Because no load‑balancing algorithm is specified in the upstream block, NGINX uses the default algorithm, Round Robin.

events { }

http {
    upstream go-backend {
        server localhost:8001;
        server localhost:8002;
        server localhost:8003;
    }

    server {
        listen 9001;
        server_name load-balancing.com;

        location / {
            proxy_pass http://go-backend;
        }
    }
}

The server is listening on port 9001 on your local machine. So, if you send a curl request to port 9001, NGINX will forward it to the group of servers running on ports 8001, 8002 and 8003, in round-robin order by default.
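The rotation can be pictured with a small sketch (an illustration of the idea, not NGINX's actual implementation):

```go
package main

import "fmt"

// roundRobin is a toy model of the default NGINX rotation: each request
// goes to the next server in the list, wrapping around at the end.
type roundRobin struct {
	servers []string
	next    int
}

// pick returns the next server in rotation.
func (rr *roundRobin) pick() string {
	s := rr.servers[rr.next]
	rr.next = (rr.next + 1) % len(rr.servers)
	return s
}

func main() {
	rr := &roundRobin{servers: []string{"localhost:8001", "localhost:8002", "localhost:8003"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick()) // 8001, 8002, 8003, then 8001 again
	}
}
```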

Also, if you want NGINX to log these requests, you can do so with the log_format directive inside the http block and the access_log directive inside the server block.

events { }

http {
    log_format upstreamlog '$server_name to: $upstream_addr [$request] '
                           'upstream_response_time $upstream_response_time '
                           'msec $msec request_time $request_time';

    upstream go-backend {
        server localhost:8001;
        server localhost:8002;
        server localhost:8003;
    }

    server {
        listen 9001;
        server_name load-balancing.com;
        access_log /usr/local/etc/nginx/access.log upstreamlog;

        location / {
            proxy_pass http://go-backend;
        }
    }
}

So, your NGINX logs will be created at the path /usr/local/etc/nginx/access.log and if you’ve followed the format mentioned in the nginx.conf file above, the logs will look something like this:

load-balancing.com to: [::1]:8002 [GET /hello HTTP/1.1] upstream_response_time 0.001 msec 1602434157.021 request_time 0.001

Load-Balancing Methods

Load balancing with NGINX uses a round-robin algorithm by default if no other method is defined in the upstream block, as in the example above. With the round-robin scheme, each server is selected in turn according to the order you list them in the nginx.conf file. This balances the number of requests equally, which works well for short operations.

1. Round Robin

NGINX distributes requests among the servers in the group in turn, according to their weights, using the Round Robin method. It can be combined with the server weights technique described in method four below.

upstream go-backend {
    server localhost:8001;
    server localhost:8002;
}

2. Least Connections

As the name suggests, this method directs the requests to the server with the least active connections at that time. It works more fairly than round-robin would with applications where requests might sometimes take longer to complete.

upstream go-backend {
    least_conn;
    server localhost:8001;
    server localhost:8002;
}
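The selection rule can be sketched as follows. This is a simplified model of least_conn (real NGINX also takes server weights into account); the connection counts are made-up example values:

```go
package main

import "fmt"

// pickLeastConn returns the server with the fewest active connections,
// breaking ties by list order.
func pickLeastConn(servers []string, active map[string]int) string {
	best := servers[0]
	for _, s := range servers[1:] {
		if active[s] < active[best] {
			best = s
		}
	}
	return best
}

func main() {
	servers := []string{"localhost:8001", "localhost:8002", "localhost:8003"}
	// Hypothetical snapshot of active connections per backend.
	active := map[string]int{"localhost:8001": 5, "localhost:8002": 1, "localhost:8003": 3}
	fmt.Println(pickLeastConn(servers, active)) // localhost:8002
}
```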

3. IP Hash

If your web application requires that a user be directed to the same back-end server as on their previous connection, use the IP hash method instead. IP hashing uses the visitor's IP address as a key to determine which host should service the request. This directs the visitor to the same server every time, provided that the server is available and the visitor's IP address hasn't changed.

upstream go-backend {
    ip_hash;
    server localhost:8001;
    server localhost:8002;
}
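The sticky mapping can be modeled with an ordinary hash function. This is a simplification (NGINX's ip_hash hashes only the first three octets of an IPv4 address), but it shows the key property: the same IP deterministically maps to the same backend.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickByIP hashes the client's IP and maps it onto the server list, so the
// same client keeps landing on the same backend.
func pickByIP(clientIP string, servers []string) string {
	h := fnv.New32a()
	h.Write([]byte(clientIP))
	return servers[h.Sum32()%uint32(len(servers))]
}

func main() {
	servers := []string{"localhost:8001", "localhost:8002"}
	// The same IP always maps to the same backend.
	fmt.Println(pickByIP("203.0.113.7", servers) == pickByIP("203.0.113.7", servers)) // true
}
```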

4. Server Weights

In a server setup where the available resources between hosts are not equal (perhaps one server is more powerful than another, for example), it might be desirable to favour some servers over others. Defining server weights lets you fine-tune load balancing with NGINX further. The server with the highest weight in the load balancer is selected most often.

upstream go-backend {
    server localhost:8001 weight=4;
    server localhost:8002 weight=2;
    server localhost:8003;
}

In the configuration shown above, the first server is selected twice as often as the second, which again gets twice the requests compared to the third.
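The 4:2:1 ratio can be checked with a toy model that expands each server into the rotation weight-many times. NGINX actually uses a smoother algorithm that interleaves servers rather than repeating them back to back, but the long-run ratios are the same:

```go
package main

import "fmt"

// weighted pairs a backend address with its weight, mirroring the upstream
// block above (a server with no weight= parameter defaults to 1).
type weighted struct {
	addr   string
	weight int
}

// buildRotation expands the weighted list into a flat rotation: a server
// with weight w appears w times per cycle.
func buildRotation(servers []weighted) []string {
	var rotation []string
	for _, s := range servers {
		for i := 0; i < s.weight; i++ {
			rotation = append(rotation, s.addr)
		}
	}
	return rotation
}

func main() {
	rotation := buildRotation([]weighted{
		{"localhost:8001", 4},
		{"localhost:8002", 2},
		{"localhost:8003", 1},
	})
	// Count picks over ten full cycles of the 7-slot rotation: 40 / 20 / 10.
	counts := map[string]int{}
	for i := 0; i < 70; i++ {
		counts[rotation[i%len(rotation)]]++
	}
	fmt.Println(counts["localhost:8001"], counts["localhost:8002"], counts["localhost:8003"]) // 40 20 10
}
```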

Now, I have slightly modified my nginx.conf file to set headers on the proxied request and display those headers in my application server, as a way to see where each request is actually being routed. The modified nginx.conf file is given below:

events { }

http {
    log_format upstreamlog '$server_name to: $upstream_addr [$request] '
                           'upstream_response_time $upstream_response_time '
                           'msec $msec request_time $request_time';

    upstream go-backend {
        server localhost:8001;
        server localhost:8002;
        server localhost:8003;
    }

    server {
        listen 9001;
        server_name load-balancing.com;
        access_log /usr/local/etc/nginx/access.log upstreamlog;

        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Host $server_name;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        location / {
            proxy_pass http://go-backend;
        }
    }
}
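On the application side, a backend might display those headers with a small handler like the following. This is a hypothetical sketch, not the article's actual server; the sample values in main mirror what NGINX would fill in for a local test (the IPv6 loopback client matches the log line shown earlier):

```go
package main

import (
	"fmt"
	"net/http"
)

// headerReport formats the values carried by the headers the config above
// sets on every proxied request.
func headerReport(host, fwdHost, fwdFor string) string {
	return fmt.Sprintf("Host: %s\nX-Forwarded-Host: %s\nX-Forwarded-For: %s\n",
		host, fwdHost, fwdFor)
}

// echoHeaders is a handler a backend could register, for example with
// http.HandleFunc("/hello", echoHeaders) and http.ListenAndServe(":8001", nil),
// to show which client and virtual host it thinks it served.
func echoHeaders(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, headerReport(r.Host,
		r.Header.Get("X-Forwarded-Host"),
		r.Header.Get("X-Forwarded-For")))
}

func main() {
	// Instead of starting a server here, print the report for sample values:
	// $host, $server_name, and an IPv6 loopback client address.
	fmt.Print(headerReport("localhost:9001", "load-balancing.com", "::1"))
}
```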

I think we’re pretty much done with our nginx.conf file now. To test this, open four terminals on your local machine: start the servers on ports 8001, 8002 and 8003 in three of them, and use the fourth to send requests to your application. Have a look at the image below to get a feel for what I mean.

Load balancing through NGINX

So, I’m running curl http://localhost:9001/hello, an API I have configured on my application server. As you can see, the request is directed to a different server each time. The server simply displays the request headers and returns "Hello world" as JSON.

That was all about load balancing through NGINX. I hope you had fun reading this :)
