Tuning Apache httpd and Tomcat to work well together
Archive - Originally posted on "The Horse's Mouth" - 2010-10-27 17:06:16 - Graham Ellis
If you're running Tomcat as your servlet / JSP container, you're more than likely to have it (or them - you may have multiple instances) running as application servers behind Apache httpd (the HTTP server) or behind a pair of HTTP servers. That's just like having a team of people providing a service (let's say fitting car tyres / tires at Carson Tyre and Autocare in Melksham), but with you being interfaced to them by the guy on reception, who knows how to check you in, what product to recommend to you, and whether to forward you to the tyres / exhausts / batteries team(s) (which team is free at the moment?) or answer the query himself.
It would be inefficient - and very frustrating to customers - if the receptionist queued up large numbers of customers in his waiting room, with three or four people waiting with decreasing patience for a team or one of a group of teams to become available. And it would be equally inefficient for him to keep everyone waiting outside until a team was completely ready for its next job, and only accept someone in through the front door at that point. Even if both reception and the teams had plenty of capacity, it could be set up in such a way that capacity was squandered. Of course, Carson Tyre have it right - I have always been impressed with a seamless handover, fast service and a sensible deal - but how do you translate the same principles to httpd and Tomcat on your web site?
Firstly, you need to understand your predicted traffic a bit, with special attention given to peak times. There isn't necessarily an easy answer here, but monitoring your existing servers through various tools such as:
• Tomcat Manager
• Apache httpd server status
• JConsole [link]
• top and free (on Linux / Unix)
• hpjtune - [link]
will give you some good metrics as to current loading. Short scripts such as [this one], when run under crontab, and logging from httpd or from Tomcat (with log4j), will allow you to build up cumulative information. With suitable tools in a language such as Perl or PHP to analyse the data logs, you can then build a picture of how things run over time. There are also tools to apply test loads to your server:
• ab (ApacheBench) [link]
• Jmeter [link]
You need to be careful with both of these that you're going to apply realistic loads; test loads tend to be simplistic when compared to a real load, and can be intrusive on a server - you won't be popular with the people whose site is hosted on your server if you accidentally simulate a denial of service attack. You can also write your own load simulator - I've posted a forked Perl example that uses ab internally [here] - it runs 25 different requests at the same time, and runs each 4 times, starting new requests from a long list of suitable URLs held in a file (which was probably extracted from a server log to get the current traffic balance!)
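As a rough illustration of the home-made simulator idea, here is a minimal Python sketch. It is not the Perl script linked above and does not call ab; it simply keeps up to 25 URLs in flight and requests each one 4 times. The function names (`fetch`, `hit_repeatedly`, `run_load`) are made up for this example, and the fetcher is passed in as a parameter so the logic can be tried out without touching a real server.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Fetch one URL and return the elapsed time in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def hit_repeatedly(url, repeats, fetcher):
    """Request the same URL `repeats` times in a row and collect the timings."""
    return [fetcher(url) for _ in range(repeats)]


def run_load(urls, concurrency=25, repeats=4, fetcher=fetch):
    """Work through `urls`, keeping at most `concurrency` of them in flight.

    Returns a list of (url, timings) pairs in the same order as `urls`.
    Any exception raised by the fetcher is re-raised when results are collected.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [(url, pool.submit(hit_repeatedly, url, repeats, fetcher))
                   for url in urls]
        return [(url, future.result()) for url, future in futures]
```

In real use the `urls` list would be read from a file, one URL per line, and pointed only at a server you are entitled to load.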
Having understood your traffic, you now need to look at how it's going to flow between the front end server (httpd / receptionist) and the back end server(s) (Tomcat / tyre teams). For the sake of simplicity, I'm going to describe the following technique for a single httpd, a single Tomcat, and for a system where virtually all the traffic is passed through from httpd to Tomcat, with no additional connectors into Tomcat carrying significant traffic levels.
Method:
1. Choose a maximum number of threads for Tomcat (we will come back and scale this later).
2. Set an acceptCount on the connector to around 25% of the maximum threads figure so that you'll have a proportion of jobs waiting at Tomcat for threads to be freed up by previous jobs at busy times.
3. Set the number of servers in httpd to be very slightly less than the sum of the maximum number of threads that Tomcat runs PLUS its accept count.
So ... set Tomcat to 200 threads, accept count of 50, and the number of httpd servers to 240 for example.
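The arithmetic in steps 1 to 3 can be written down as a small Python sketch. The key names `maxThreads` and `acceptCount` are the Tomcat connector attributes, and `MaxClients` is the httpd 2.2 directive for the number of servers. The 4% headroom is purely an assumption, picked because it reproduces the 240 in the example above; "very slightly less" is not defined more precisely than that.

```python
def balanced_settings(max_threads, queue_percent=25, headroom_percent=4):
    """Derive acceptCount and the httpd server limit from Tomcat's maxThreads.

    queue_percent    - acceptCount as a percentage of maxThreads (step 2)
    headroom_percent - how far below Tomcat's total capacity httpd is kept
                       (step 3); the default of 4 is an assumed figure
    """
    accept_count = max_threads * queue_percent // 100
    tomcat_capacity = max_threads + accept_count   # running plus queued requests
    max_clients = tomcat_capacity * (100 - headroom_percent) // 100
    return {
        "maxThreads": max_threads,
        "acceptCount": accept_count,
        "MaxClients": max_clients,
    }


if __name__ == "__main__":
    print(balanced_settings(200))
```

Because everything is derived from the one thread count, scaling up or down later keeps the proportions intact.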
You now have the servers in balanced proportions - there's no way (in this simple case) that requests will be queued in httpd just to find they're being rejected after they've been through the receptionist because there's no tire crew available. And there's no way that the receptionist will be turning people away even though there's spare capacity in the tire crews.
Now - it's all very well having things in the right proportion, but do we have plenty of spare space in our workshop in which we could deploy more tyre crews (and do we need to?), or do we have more crews than space, so that they're constantly tripping over each other, having to borrow resources, and are thus slowing the whole process down?
4. If you're running a "vanilla" Tomcat, it'll be limited to a tiny 64 Mbytes of memory for the JVM; in effect, you may have a web server with 2 Gbytes or more of real memory, and be using only a few percent of it for the real work. Increase it ... a good first guess may be 256 Mbytes ... [see how].
5. Monitor your servers and learn what's going on under a normal load. Use the various tools I've listed above. And from what you learn, you can make adjustments ...
5a. Increase / decrease the memory available to Tomcat. But don't let it get to swapping all the time. Little point in fighting for memory!
5b. Scale the thread counts up and down as appropriate within the proportions already calculated so that the CPU(s) are not overloaded. Little point in fighting for the CPU!
5c. There are other tailorings that can be made too within the JVM, such as how much space is given to each of the memory areas, which Garbage Collector is used to gather up memory that's available for reuse, when the full garbage collector runs and when the scavenger (quick pass through, Eden space only) runs. In many circumstances you'll find that the best selection algorithm to use is the one that Tomcat and the JVM will select by default anyway. One piece of advice, though, is to set the JVM to run -server rather than the default -client - you get a faster runtime at the expense of a slower start up. If you really want to get into the JVM tuning further, have a look [here].
5d. The httpd daemon can run with two models - "prefork MPM" and "worker MPM". The prefork model starts a number of processes when the server starts, and adds to that number as load increases until the maximum set in step 3 is reached. The number is reduced again as / when / if the server gets quieter (see previous blog). The worker model also starts multiple processes, but each can handle a fixed number of connections at a time, and when a process gets full a fresh one is started for a fresh batch of 25 (or however many) extra connections. The worker model, with its higher individual process size, is designed for many-processor machines and you'll want to tune it to have a process running on most processors in parallel. The prefork model, on the other hand, is very much better suited to single processor machines. See [Annotate example file].
5e. You'll note that both httpd and Tomcat give you the ability to set the number of requests each thread / process can handle sequentially, and you're given the option to turn that limit off and have them go on for ever. Why would you want to stop and restart an identical process? Well - it would NOT be identical. Threads grow in size, especially when they get a particularly heavy request (e.g. a travel routing site is asked for the best route from Melksham to Kinlochbervie) and the memory used won't be released back to the operating system while the process remains alive; the occasional killing off of big old threads and their replacement by smaller children is a positive move. You'll also hear the term "memory leaks" used, and this facility of refreshing threads being described as helping to prevent problems caused by such memory leaks; it may be that certain servlet applications do grow / not release memory when they should - such leaks are often caused by obscure coding errors.
5f. You may wish to switch the connector between httpd and Tomcat. When I first started giving courses on the servers, which connector to use was a huge subject - jk, jk2, warp, jserv, proxy. These days (Apache 2.2) mod_proxy is built in, it has a load balancer capability (and a good, tunable one) if you're going to run multiple Tomcats, and you can select which protocol to use too - the "standard" http, the more compact ajp, or the more secure https. And just before you decide to use https because of the type of traffic you're carrying, consider whether you really need that extra security between what are likely to be two processes on the same box, which is probably sitting behind a firewall anyway.
5g. There are other settings in the "extra/default" file of Apache httpd which you might want to look at - such as altering the timeout for connections - see [previous article] for some details. I would usually recommend that you have keepalive on to avoid huge numbers of connections making and breaking (exception - if visitors are all one-hit wonders to single files on something like a .pdf repository). But don't make the KeepAliveTimeout too long or you'll tie up resources "just in case" someone comes back a long while later.
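For the worker model described in step 5d, the number of processes follows from the connection limit and the batch size. The sketch below assumes the httpd 2.2 directive names MaxClients and ThreadsPerChild for those two values, and uses the figure of 25 connections per process mentioned in step 5d as the default; the function name is made up for the example.

```python
def worker_processes(max_clients, threads_per_child=25):
    """Number of worker MPM child processes needed to serve `max_clients`
    simultaneous connections, rounded up to a whole process."""
    if max_clients <= 0 or threads_per_child <= 0:
        raise ValueError("both values must be positive")
    return -(-max_clients // threads_per_child)  # ceiling division


if __name__ == "__main__":
    for max_clients in (100, 240, 250):
        print(max_clients, "connections need",
              worker_processes(max_clients), "processes")
```

Comparing the result with the number of processors on the box gives a first feel for whether the batch size suits the hardware.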
There are no standard rules to apply in balancing your load, and I'll advise you to put some load monitoring tools in place so that you can keep an eye on your running system. We keep Apache httpd log files (which show all the traffic to our Tomcat too, of course) which we analyse daily and can troubleshoot if there's a specific incident, and we have a crontab job which collects data from the system and web servers every five minutes and logs that too. Finally, we have heartbeat scripts which run from once an hour to much more often (depending on the server) which trigger action should they fail.
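A daily log analysis can start very small. The following Python sketch counts requests per minute in an httpd access log to show where the peaks fall. It assumes timestamps in the Common Log Format, and the function names are illustrative; it is not one of the scripts referred to above.

```python
import re
from collections import Counter

# Matches the timestamp field of the Common Log Format, for example
# [27/Oct/2010:17:06:16 +0100]. Seconds and timezone are left out of the
# captured group so that requests are grouped by minute.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]")


def requests_per_minute(lines):
    """Count the requests logged in each minute; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def busiest_minutes(lines, top=3):
    """Return the `top` busiest minutes as (minute, count) pairs."""
    return requests_per_minute(lines).most_common(top)
```

Passing an open log file to `busiest_minutes` works directly, since a file object yields its lines one at a time.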
6. Where you have a number of Java applications running on the same system, you can choose whether to run them all on the same Tomcat or have multiple copies. Using a single Tomcat means that memory that's used in a busy period on one application can then be used by another application within the JVM when the first application has gone quiet, and that any one application can grab much more resource if it peaks. However, within a single Tomcat there's a risk of one application interfering with another by grabbing all the resources, and if you need to restart your Tomcat because of one application you'll have to "bounce" the other one too.
There are further aspects to take into account as you tune things too. It might be that your Apache http server directly handles a significant proportion of traffic itself, just passing the meatier processing jobs on to Tomcat; there's really little point in asking Tomcat for flat, unchanging CSS files, images or .pdfs, for example ... after all, our garage receptionist isn't going to call up a tyre crew just to answer a customer who asks what time they close. And if you've got a significant proportion of such traffic, you may want to proportionally scale up the httpd threads / processes.
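One way to put a number on that scaling is sketched below in Python. It rests on an assumption that is mine rather than a rule: if a given percentage of simultaneous requests is answered by httpd itself, the httpd limit is raised so that the remaining share still matches the figure worked out for Tomcat. The function name and the 20% figure are purely illustrative.

```python
def httpd_slots(proxied_slots, static_percent):
    """Scale up the httpd process / thread limit when httpd serves part of
    the traffic itself.

    proxied_slots  - the limit calculated for traffic passed on to Tomcat
    static_percent - assumed share (0 to 99) of simultaneous requests that
                     httpd answers directly (CSS, images, .pdfs)
    """
    if not 0 <= static_percent < 100:
        raise ValueError("static_percent must be between 0 and 99")
    # Ceiling division, so the Tomcat share is never rounded down.
    return -(-proxied_slots * 100 // (100 - static_percent))


if __name__ == "__main__":
    print(httpd_slots(240, 20))
```

The static share itself is best measured from your own logs rather than guessed.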
With multiple systems running Tomcat, with multiple (perhaps load balanced / potentially hotswapped) httpds feeding into each of them, the interaction of the variables across the systems gets to be something you must think of carefully, and you need to consider whether to use by-request, by-traffic or by-queue-size balancing and other extra parameters too, but the basics outlined earlier in this article should stand you in good stead.