In this article, we explore the four leading software-based reverse proxy solutions and compare features offered by each solution.
Reverse proxy server overview
Reverse proxy servers offer improved security, load balancing, DDoS attack mitigation, content optimization, caching, high availability (by eliminating single points of failure), and robust performance tuning options. When a reverse proxy proxies a request, it sends the request to a proxied server. Next, the proxied server processes the request and returns a response to the proxy server. Finally, the proxy server forwards the response to the requesting client. If the proxy server has caching enabled, it can return cached content instead of forwarding the request to a proxied server.
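To make the request flow concrete, here is a minimal sketch in Python. It is a toy model rather than a real network proxy: the `CachingReverseProxy` class, the `demo_backend` function, and the path-keyed cache are all assumptions made for illustration, and a real proxy would also honor cache headers and expiry.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a proxied (back-end) server:
# it maps a request path to a response body.
Backend = Callable[[str], str]


class CachingReverseProxy:
    """Toy model of the proxy request flow; not a real network proxy."""

    def __init__(self, backend: Backend, caching_enabled: bool = True) -> None:
        self.backend = backend
        self.caching_enabled = caching_enabled
        self.cache: Dict[str, str] = {}

    def handle(self, path: str) -> str:
        # With caching enabled, answer from the cache and skip the back end.
        if self.caching_enabled and path in self.cache:
            return self.cache[path]
        # Otherwise forward the request and relay the response to the client.
        response = self.backend(path)
        if self.caching_enabled:
            self.cache[path] = response
        return response


calls: List[str] = []


def demo_backend(path: str) -> str:
    calls.append(path)  # record every request that reaches the back end
    return f"content for {path}"


proxy = CachingReverseProxy(demo_backend)
first = proxy.handle("/index.html")   # forwarded to the proxied server
second = proxy.handle("/index.html")  # served from the proxy's cache
```

In this sketch the second request never reaches `demo_backend`, which is the effect caching is meant to have on the proxied servers.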
Performance considerations aside, it is a recommended best practice to place a reverse proxy in front of several back-end server technologies, such as Node.js, for security reasons. A proxy server running on port 80 or 443 can proxy requests to a back-end server running on a higher port. In Unix/Linux operating systems, ports below 1024 are restricted to processes running under root (administrator) accounts. By proxying requests, you can run the back-end server as a standard user, which minimizes the damage that can be done to your internal infrastructure if the back-end server is compromised. Another benefit of reverse proxy servers is that they can keep information about your back-end servers from being visible outside your internal network. Proxy servers also tend to be more efficient at establishing TLS/SSL connection handshakes than many dynamic servers, such as Node.js. It is possible to configure a proxy server to support HTTPS and secure HTTP/2 connections and forward the decrypted requests to proxied servers. This approach enables the proxy server(s) to take the burden of encryption off the internal proxied servers, dedicating them to serving content. Lastly, reverse proxy servers give you the ability to limit client requests, limit connections, and blacklist malicious or unauthorized clients.
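The request limiting and blacklisting mentioned above can be sketched as follows. This is only an illustration of the idea, assuming a sliding-window budget per client IP; the `RequestGate` class, its parameters, and the example addresses are hypothetical and do not reflect how any of the proxies discussed here implement these features.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


class RequestGate:
    """Rejects blacklisted clients and clients that exceed a request budget."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        blacklist: Iterable[str] = (),
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.blacklist = set(blacklist)
        self.clock = clock  # injectable so the demo does not need to sleep
        self.history: Dict[str, List[float]] = defaultdict(list)

    def allow(self, client_ip: str) -> bool:
        if client_ip in self.blacklist:
            return False
        now = self.clock()
        # Keep only the timestamps that still fall inside the window.
        recent = [t for t in self.history[client_ip] if now - t < self.window_seconds]
        allowed = len(recent) < self.max_requests
        if allowed:
            recent.append(now)
        self.history[client_ip] = recent
        return allowed


# Demo with a fake clock and documentation-range IP addresses.
fake_now = [0.0]
gate = RequestGate(
    max_requests=2,
    window_seconds=1.0,
    blacklist={"203.0.113.9"},
    clock=lambda: fake_now[0],
)
results = [gate.allow("198.51.100.7") for _ in range(3)]  # third one is over budget
fake_now[0] = 1.5                                         # the window has passed
after_window = gate.allow("198.51.100.7")
blocked = gate.allow("203.0.113.9")                       # always rejected
```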
Proxy servers can also help to improve website response times. While servers such as Node.js excel at efficiently serving dynamic content, they are typically not as efficient at serving or caching static content as a reverse proxy server. Some proxy servers can be configured to serve static content themselves and to pass only requests requiring a dynamic response to the proxied server(s). Proxy servers can enable support for more performant protocols, such as HTTP/2, that internal proxied servers may not support. Additionally, all major reverse proxies support multiple load balancing strategies, providing scaling and redundancy options.
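The split between static and dynamic content might look roughly like the sketch below. The extension list, the in-memory `static_files` dictionary, and the `app_server` function are assumptions for illustration; a real proxy would read files from disk and match requests using its own configuration rules.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List

# Assumed list of extensions treated as static content.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".svg", ".ico"}


def route(
    path: str,
    static_files: Dict[str, bytes],
    backend: Callable[[str], bytes],
) -> bytes:
    """Serve known static files directly; pass everything else to the back end."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_EXTENSIONS and path in static_files:
        return static_files[path]
    return backend(path)


backend_hits: List[str] = []


def app_server(path: str) -> bytes:
    # Hypothetical dynamic back end (for example, a Node.js application).
    backend_hits.append(path)
    return b"dynamic:" + path.encode("utf-8")


static_files = {"/css/site.css": b"body{}"}
css = route("/css/site.css", static_files, app_server)  # served by the proxy
api = route("/api/users", static_files, app_server)     # passed to the back end
```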
Note: Some may differentiate between reverse proxy servers and load balancers, as they perform two distinct tasks. Because both reverse proxying and load balancing are supported by the most popular reverse proxy servers, both are considered roles of a proxy server in the context of this article.
Popular reverse proxy software solutions
There are several hardware and software reverse proxy solutions. Four of the most popular software proxy servers today are Apache Traffic Server, HAProxy, NGINX, and Varnish. All four of these proxies are used by large organizations with high traffic volumes. While these proxy servers can handle large amounts of traffic, they are also scalable and lean enough to power smaller, low-volume sites such as this blog.
- Apache Traffic Server (Apache TS)
- Yahoo released the Apache TS source code to the open source community in 2009. Apache TS has been used internally by Yahoo for some time as both a reverse proxy and load balancer, handling hundreds of terabytes of traffic on a daily basis. According to the project's website, Apache TS is used by several large, high-traffic firms including Akamai, Comcast, GoDaddy, LinkedIn, and Powerhttp. Unlike some newer proxy architectures, Apache TS combines an event-driven model with a multi-threaded processing model for handling incoming requests from a thread pool. Apache TS can be more resource intensive than some of the alternative proxy servers due to its threading model.
- HAProxy is a free, open source proxy server initially released in 2001. HAProxy is well-known for its low memory usage and CPU efficiency. According to HAProxy's website, this popular reverse proxy is used by firms including Airbnb, Alibaba, DISQUS, GitHub, Instagram, Reddit, Stack Overflow, and Twitter. HAProxy is less feature rich than other competing solutions, but this also helps to minimize the resources it requires. In order to reduce the cost of context switching, HAProxy implements a single-process event-driven model similar to Node.js. HAProxy does not create or use a new thread to handle each request or connection, which minimizes HAProxy's resource requirements and improves overall throughput. Some manual configuration is required to optimize HAProxy's performance on multi-core server deployments (see HAProxy best practices as a starting point).
- Initially released in 2004, NGINX is currently the most popular web server and proxy server in deployment for high traffic websites. NGINX is feature rich, and supports acting as an origin server, load balancer, or reverse proxy. NGINX is free and open source. However, some advanced features are only available in the commercial NGINX PLUS version of the product. Since NGINX is open source and supports modules, some commercial features have been re-implemented by third parties and can be added to a standard NGINX instance. As of December 31st, 2016, a basic NGINX PLUS annual support license costs $1,900 per instance. NGINX is used by several firms including Alkami, Amazon, Cloudflare, Dropbox, Eventbrite, Hulu, GOV.UK, Groupon, HP, NASA, Pinterest, SoundCloud, Wix, WordPress.com, and Zappos. Like HAProxy, NGINX employs an event-driven processing model, typically with a worker process for each CPU core.
- Varnish is the newest of the popular reverse proxy servers, and was initially released in 2006. Varnish was designed exclusively as an HTTP web accelerator and reverse proxy. Varnish is used by high-traffic firms including Facebook, The Guardian, The New York Times, Tumblr, Twitter, and Vimeo. Similar to Apache TS, Varnish utilizes a multi-threaded processing model and assigns one thread from a thread pool to each incoming HTTP connection. While Varnish is utilized by some large firms, this reverse proxy is not nearly as popular as NGINX or HAProxy and has a smaller community. According to its latest release notes, Varnish currently appears to be in need of financial and developer support (see we need more money section).
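The event-driven model used by HAProxy and NGINX can be illustrated with Python's `asyncio`. This is only an analogy under stated assumptions: the proxies themselves are not written in Python, `handle_connection` is a hypothetical handler, and `asyncio.sleep` stands in for waiting on network I/O. The point is that one thread can interleave many connections instead of dedicating a thread to each.

```python
import asyncio
import threading
from typing import List, Set, Tuple


async def handle_connection(conn_id: int, seen_threads: Set[int]) -> int:
    # Record which OS thread is doing the work for this connection.
    seen_threads.add(threading.get_ident())
    # Stand-in for waiting on network I/O; the event loop runs other
    # connections while this one is waiting.
    await asyncio.sleep(0.01)
    return conn_id


async def serve(connection_count: int) -> Tuple[List[int], Set[int]]:
    seen_threads: Set[int] = set()
    handled = await asyncio.gather(
        *(handle_connection(i, seen_threads) for i in range(connection_count))
    )
    return list(handled), seen_threads


# One event loop on one thread handles all 100 simulated connections.
handled, threads_used = asyncio.run(serve(100))
```

A thread-per-connection design would instead need up to 100 threads for the same workload, which is where the extra memory and context-switching cost comes from.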
All four of the leading proxy servers have proven their ability to handle large traffic volumes in real production environments. Unfortunately, I have been unable to find any legitimate, freely available scientific performance benchmarks for Apache TS vs. HAProxy vs. NGINX vs. Varnish. If performance is a deciding factor, it would be prudent to do some internal benchmarking before selecting a reverse proxy for your specific website needs. Due to the unique topologies of websites and cloud services, your performance mileage may vary.
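For internal benchmarking, a very small latency harness along the following lines may be a starting point. It is a sketch, not a substitute for a dedicated load-testing tool: the `benchmark` function and its nearest-rank percentile are my own assumptions, it sends requests one at a time, and the `time.sleep` call is a placeholder that you would replace with a real request through each candidate proxy.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(send_request: Callable[[], None], runs: int = 200) -> Dict[str, float]:
    """Time repeated calls to send_request and summarize latency in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Nearest-rank 95th percentile on the sorted samples.
    p95_index = max(0, round(0.95 * len(latencies)) - 1)
    return {
        "mean_ms": statistics.mean(latencies) * 1000,
        "median_ms": statistics.median(latencies) * 1000,
        "p95_ms": latencies[p95_index] * 1000,
        "max_ms": latencies[-1] * 1000,
    }


# Placeholder workload; swap in a function that issues one HTTP request
# through the proxy under test.
report = benchmark(lambda: time.sleep(0.001), runs=20)
```

Running the same harness against each proxy, with the same back end and the same request mix, gives numbers that are at least comparable with each other for your own topology.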
Since session persistence is an important advanced feature when a proxy server load balances requests across multiple proxied servers, it may be helpful to read additional details on how each proxy server solution addresses session persistence. Some starting points are provided below:
- Apache TS
- Apache TS has comparatively limited load balancer plugin documentation in its Balancer Plugin Reference
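One common way to approximate session persistence is to hash the client's IP address so that the same client is always sent to the same proxied server. The sketch below shows the idea only; `pick_backend`, the back-end names, and the choice of SHA-256 are assumptions for illustration and are not taken from any of the four proxies.

```python
import hashlib
from typing import List


def pick_backend(client_ip: str, backends: List[str]) -> str:
    """Map a client IP to a back end deterministically (IP-hash persistence)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it
    # around the number of available back ends.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["app1:3000", "app2:3000", "app3:3000"]  # hypothetical proxied servers
first_visit = pick_backend("198.51.100.7", backends)
second_visit = pick_backend("198.51.100.7", backends)  # same server as first_visit
```

A limitation of this simple approach is that adding or removing a back end changes the modulus, so many clients are remapped to a different server. Cookie-based persistence avoids that at the cost of extra state in the response.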
Feature comparison matrix
The following table compares high-level features for the four most popular reverse proxy software solutions.
| Feature | Apache TS | HAProxy | NGINX | Varnish |
|---|---|---|---|---|
| Administration Console | Yes | Yes | Yes (NGINX Plus version more feature rich) | Yes |
| Dynamic Modules / Plugins | Yes | No | Yes | Yes |
| Health Monitoring | Yes | Yes | Yes (NGINX Plus version more feature rich) | Yes |
| HTTP/2 Support | Yes | Yes | Yes | Yes (Experimental September 2016) |
| Load Balancing - IP-hash | Yes | Yes | Yes | Yes (client director) |
| Load Balancing - Least-connected | No | Yes | Yes | No |
| Load Balancing - Round-robin | Yes | Yes | Yes | Yes |
| Load Balancing - Weighted | No | Yes | Yes | Yes |
| OS - Linux Support | Yes | Yes | Yes | Yes |
| OS - Mac OS X Support | Yes | Unofficially | Yes | Unofficially |
| OS - Windows Support | No | No | Yes (lower performance) | No |
| OS - FreeBSD Support | Yes | Yes | Yes | Yes |
| Real-time Statistics | Yes | Yes | Yes (NGINX Plus version more feature rich) | Yes |
| Session Persistence | No | Yes | Yes (NGINX Plus offers additional options, third-party module available) | Yes |
| SSL Termination / Offloading | Yes | Yes | Yes | No |
| TCP Proxy / Load Balancer | Yes | Yes | Yes | Yes |
| UDP Proxy / Load Balancer | No | No | Yes (NGINX Plus offers advanced options) | No |
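For readers unfamiliar with the load balancing strategies named in the table, the following sketch shows round-robin, weighted, and least-connected selection in their simplest form. The function names and back-end names are hypothetical, and real proxies use more refined variants (for example, weighted schemes that interleave servers more smoothly than this one does).

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Cycle through the back ends in order, one request each."""
    return itertools.cycle(backends)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighting: repeat each back end as many times as its weight."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connected(active_connections: Dict[str, int]) -> str:
    """Pick the back end currently serving the fewest connections."""
    return min(active_connections, key=lambda name: active_connections[name])


rr = round_robin(["app1", "app2", "app3"])
rr_picks = [next(rr) for _ in range(4)]
# rr_picks == ["app1", "app2", "app3", "app1"]

wrr = weighted_round_robin({"big": 2, "small": 1})
wrr_picks = [next(wrr) for _ in range(6)]
# wrr_picks == ["big", "big", "small", "big", "big", "small"]

lc_pick = least_connected({"app1": 12, "app2": 3, "app3": 7})
# lc_pick == "app2"
```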
Selecting a reverse proxy
The four most popular reverse proxy solutions are comparable in capabilities and have been proven to be reliable in production environments. Depending on your unique deployment scenario and needs, one particular reverse proxy server solution may be more appropriate. NGINX and HAProxy are the leading reverse proxy servers, and it is sensible for most to choose one of these two solutions.
Varnish does not seem to have the same velocity of new development. As mentioned earlier, Varnish also appears to be in financial trouble. This, coupled with the fact that Varnish does not offer nearly as many features as Apache TS, HAProxy, or NGINX, makes Varnish a difficult recommendation for new deployment projects.
NGINX vs. NGINX PLUS
For this blog and other projects, I have elected to use NGINX as my reverse proxy solution. NGINX is widely used, has a large community, and offers all the features I require. If you are considering using NGINX for a project, and have not used it in the past, it is worth educating yourself on the differences between the free version of NGINX and its commercial counterpart, NGINX PLUS. NGINX contains a subset of the features available in NGINX PLUS. Conveniently, NGINX provides a feature comparison matrix on its NGINX and NGINX PLUS feature matrix page.
On occasion, it can be frustrating when a feature has been crippled or eliminated in standard NGINX. There are often alternative ways to obtain missing functionality. For example, Session Persistence is officially only available with NGINX PLUS. However, third-party modules such as the NGINX Sticky Module can be added to standard NGINX to obtain similar behavior.
My only complaint with NGINX is its limited diagnostic status page compared to NGINX PLUS's Live Activity Monitoring feature. Standard NGINX ships with the limited ngx_http_stub_status_module, which displays select status information on a spartan plain-text status page, including: active connections, accepted connections, handled connections, total client requests, reading connections, writing connections, and waiting connections:
Active connections: 62
server accepts handled requests
 45687 45687 89789
Reading: 9 Writing: 1 Waiting: 0
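Because the status page is plain text, it is straightforward to parse into numbers that other tools can consume. Here is a minimal sketch, assuming the output has the same fields as the sample above; the `parse_stub_status` function and its field names are my own, and the regular expression tolerates the text arriving on one line or several.

```python
import re
from typing import Dict

STUB_STATUS_PATTERN = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text: str) -> Dict[str, int]:
    """Turn stub_status plain-text output into a dictionary of counters."""
    match = STUB_STATUS_PATTERN.search(text)
    if match is None:
        raise ValueError("unrecognized stub_status output")
    return {name: int(value) for name, value in match.groupdict().items()}


# The sample values from this article.
sample = (
    "Active connections: 62 \n"
    "server accepts handled requests\n"
    " 45687 45687 89789 \n"
    "Reading: 9 Writing: 1 Waiting: 0 \n"
)
status = parse_stub_status(sample)
# status["active"] == 62, status["requests"] == 89789, status["waiting"] == 0
```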
While the free Status Module is better than nothing at all, it is significantly lacking when compared to NGINX PLUS's Live Activity Monitoring module, which displays more information in a visually appealing manner, as shown in the screenshots below:
Like most NGINX PLUS features, the same Live Activity Monitoring functionality available in NGINX PLUS can be accomplished with NGINX and a little ingenuity. Since both NGINX and NGINX PLUS provide robust logging configuration options, it is possible to replicate live activity monitoring by using external data visualization tools such as Elastic Stack (Kibana + Elasticsearch + Beats + Logstash) or SolarWinds. NGINX is a solid reverse proxy solution, but please consider the differences between NGINX and NGINX PLUS before choosing this solution over HAProxy.
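As a rough idea of what log-based monitoring involves, the sketch below counts responses by status code from access log lines. It assumes the logs use NGINX's "combined" format; the `status_counts` function is hypothetical and the three log lines are made-up sample data, not real traffic. A tool such as Logstash performs the same kind of parsing before the data reaches a dashboard.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the leading fields of a "combined" format access log line:
# client IP, two unused fields, timestamp, request line, status, bytes sent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def status_counts(lines: Iterable[str]) -> Counter:
    """Count responses per HTTP status code, skipping lines that do not parse."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("status")] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    '198.51.100.7 - - [31/Dec/2016:10:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.50"',
    '198.51.100.8 - - [31/Dec/2016:10:00:01 +0000] "GET /missing HTTP/1.1" 404 169 "-" "curl/7.50"',
    '198.51.100.7 - - [31/Dec/2016:10:00:02 +0000] "GET /about HTTP/1.1" 200 980 "-" "curl/7.50"',
]
counts = status_counts(sample_log)
# counts == Counter({"200": 2, "404": 1})
```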