iPerSec
internet Performance & Security

HTTP Accelerators

October 19th, 2007 by jfbus

If you have ever run Apache on a moderately or heavily loaded website, you already know the limitations of the prefork model traditionally used by Apache. Each client connection uses a dedicated Apache process with its own PHP interpreter, and each process consumes anywhere from a few megabytes to 50-150 MB of RAM for large PHP apps.

When using KeepAlives (KeepAlive On), Apache processes are mostly idle, waiting for the client to send another HTTP request. Without KeepAlives (KeepAlive Off), page load times are slower because each object requires a new TCP connection, and the server can be overloaded by the number of TCP connections being opened and closed.
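To put rough numbers on this, here is a back-of-the-envelope sketch in PHP. The function name and the figures (2 GB of RAM left for Apache, 100 MB or 10 MB per process) are illustrative assumptions, not measurements. Under the prefork model with KeepAlive On, the number of processes that fit in memory is also the number of clients that can stay connected at the same time.

```php
<?php
// Rough capacity estimate for the prefork model: one process per connection.
// All figures are hypothetical; measure your own processes before relying on them.
function maxPreforkProcesses(int $availableRamMb, int $ramPerProcessMb): int
{
    if ($ramPerProcessMb <= 0) {
        throw new InvalidArgumentException("ramPerProcessMb must be positive");
    }
    // Each connection pins a whole process, so RAM caps concurrent connections.
    return intdiv($availableRamMb, $ramPerProcessMb);
}

// 2048 MB for Apache, 100 MB per process (large PHP app): prints 20.
echo maxPreforkProcesses(2048, 100), "\n";
// Same box, 10 MB per process (small app): prints 204.
echo maxPreforkProcesses(2048, 10), "\n";
```

With idle KeepAlive connections holding most of those processes, a few dozen slow clients are enough to fill the box.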

Alternatives

Newer versions of Apache introduce better MPMs that no longer use the “each connection uses a dedicated process” model, enabling you to use KeepAlive without eating too much RAM. The worker MPM uses multiple processes, each running multiple threads, and the event MPM builds on worker by handing idle KeepAlive connections to a dedicated listener thread. If you plan to use one of these, be aware that using PHP with the worker MPM can be dangerous, as some libraries used by PHP extensions might not be thread-safe, and that the event MPM is experimental.

There are also alternative web servers, like lighttpd: single-process, event-driven and high-performance, using a separate pool of FastCGI PHP processes. They are great for simple to medium-complexity setups, but lack some useful features (e.g. per-virtual-host PHP configuration, complex authentication mechanisms).

You can get the best of both worlds using an HTTP accelerator: a simple, high-performance reverse proxy located in front of your Apache setup.

The benefits of this setup are:

  • the HTTP accelerator handles all the TCP connection part, and is able to manage thousands (even tens of thousands) of TCP connections, whereas Apache is limited to a couple of hundred on standard hardware,
  • the accelerator can cache content (serve static content without having to ask the back-end server for it), and sometimes compress it,
  • the accelerator can load-balance HTTP requests to multiple back-end servers,
  • the accelerator multiplexes HTTP requests to a small number of Apache processes, each working at full rate,
  • you still benefit from Apache’s flexibility for your back-end setup.
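As an illustration of the load-balancing point, here is a minimal round-robin sketch in PHP. Round-robin is only one possible policy, chosen here because it is the simplest; the class name and the back-end addresses are hypothetical, and real accelerators implement this internally with health checks and other refinements.

```php
<?php
// Minimal round-robin picker; class name and back-end addresses are hypothetical.
final class RoundRobinBalancer
{
    /** @var string[] */
    private array $backends;
    private int $next = 0;

    public function __construct(array $backends)
    {
        if (count($backends) === 0) {
            throw new InvalidArgumentException("at least one back-end is required");
        }
        // Re-index so positions always run 0..n-1.
        $this->backends = array_values($backends);
    }

    // Returns the back-end that should receive the next request.
    public function pick(): string
    {
        $backend = $this->backends[$this->next];
        $this->next = ($this->next + 1) % count($this->backends);
        return $backend;
    }
}

$balancer = new RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080"]);
for ($i = 0; $i < 4; $i++) {
    // Alternates between the two back-ends, starting with the first.
    echo $balancer->pick(), "\n";
}
```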

Types of HTTP accelerators

HTTP accelerators can be:

  • hardware-based: most load balancers (F5 BIG-IP, Citrix NetScaler, …) include HTTP acceleration functions (sometimes as an option); there are also dedicated boxes (Juniper DX). Load balancers are usually complex to set up and Juniper boxes very easy, but all of them are quite expensive.
  • software-based, like squid and varnish, both open-source packages.

squid is complex (some would say a little bit bloated), but very powerful - it can be used as a forward proxy as well as an HTTP accelerator. varnish is simple and extremely fast, but can only be used as an HTTP accelerator (which is not a problem in our case).

Sample setup

Here is a simple, single-server setup:

  • the HTTP accelerator and the back-end server are both on the same box,
  • the accelerator listens on port 80, and the back-end server on port 8080.

Apache configuration:
Listen 8080
[...]
NameVirtualHost *:8080
[...]
<VirtualHost *:8080>
[...]
</VirtualHost>

Varnishd startup:
varnishd -b localhost:8080

I’ll talk a little bit more about varnish in the future.

Drawbacks

The back-end web server no longer sees the remote IP address; it only sees the IP of the HTTP accelerator (127.0.0.1). Some HTTP accelerators can add the remote IP in a dedicated HTTP header; otherwise, you can get the remote IP from the X-Forwarded-For header:
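if ($_SERVER["REMOTE_ADDR"] == "127.0.0.1" && !empty($_SERVER["HTTP_X_FORWARDED_FOR"])) {
    // X-Forwarded-For is a comma-separated list of addresses.
    $ipArr = array_map("trim", explode(",", $_SERVER["HTTP_X_FORWARDED_FOR"]));
    // Assuming the accelerator appends the address it saw, the last entry
    // is the one it added; entries further left can be forged by the client.
    $_SERVER["REMOTE_ADDR"] = end($ipArr);
}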
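The same idea can be sketched as a reusable function that only honours X-Forwarded-For when the request really came from a known proxy. The function name and the $trustedProxies parameter are assumptions made for this example, not part of any accelerator's API. It walks the header from right to left, because entries on the left are supplied by the client and can be forged.

```php
<?php
// Returns the first address, reading X-Forwarded-For from right to left,
// that is not one of our own proxies. $trustedProxies is a hypothetical list.
function clientIpFromForwardedFor(string $remoteAddr, ?string $forwardedFor, array $trustedProxies): string
{
    // Ignore the header unless the request came through one of our proxies.
    if (!in_array($remoteAddr, $trustedProxies, true)
        || $forwardedFor === null || trim($forwardedFor) === "") {
        return $remoteAddr;
    }
    $hops = array_map("trim", explode(",", $forwardedFor));
    for ($i = count($hops) - 1; $i >= 0; $i--) {
        if (filter_var($hops[$i], FILTER_VALIDATE_IP) === false) {
            break; // malformed entry: stop trusting anything further left
        }
        if (!in_array($hops[$i], $trustedProxies, true)) {
            return $hops[$i];
        }
    }
    // Nothing usable in the header: fall back to the connection address.
    return $remoteAddr;
}

$trusted = ["127.0.0.1"];
// Request relayed by the local accelerator: prints 203.0.113.7
echo clientIpFromForwardedFor("127.0.0.1", "10.1.2.3, 203.0.113.7", $trusted), "\n";
// Direct request with a spoofed header: prints 198.51.100.9
echo clientIpFromForwardedFor("198.51.100.9", "203.0.113.7", $trusted), "\n";
```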

Posted in HTTP, Tuning
