A common tip for increasing Apache performance is to turn off the per-directory configuration files (aka .htaccess files) and merge their rules into your main Apache server configuration file (httpd.conf).
Jeremy raised an interesting question: when is the performance loss caused by using many .htaccess files offset by their ease of maintenance? He argues, and I agree, that it often makes sense to keep configuration local in .htaccess files despite the performance loss, because they are easier to maintain.
It’s fairly logical that the multiple-.htaccess route will be slower: for every directory node in the request URI, the webserver has to look for an .htaccess file and merge any rules it finds. So we pay a filesystem seek and read for every subdirectory.
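To make that cost concrete, here’s a small sketch (a hypothetical helper for illustration, not Apache code) of the per-directory lookups implied when .htaccess support is enabled:

```python
import os

def htaccess_lookups(docroot, rel_path):
    """Return the .htaccess files Apache must check (a stat, plus a read
    if the file exists) while walking from the docroot down to the
    requested file. Illustrative only -- not part of Apache itself."""
    checks = [os.path.join(docroot, '.htaccess')]
    current = docroot
    for part in rel_path.strip('/').split('/')[:-1]:  # every directory node
        current = os.path.join(current, part)
        checks.append(os.path.join(current, '.htaccess'))
    return checks

# A request ten directories deep means eleven .htaccess lookups:
lookups = htaccess_lookups('/Users/simon/server/htdocs_access',
                           '0/1/2/3/4/5/6/7/8/9/1.bar')
print(len(lookups))  # prints 11: the docroot plus each of the ten subdirectories
```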
However, is this a major issue? How much of a performance hit is there? Let’s find out:
Ok. Let’s make two docroots each with the same structure and files.
htdocs_access – the .htaccess version. This has one .htaccess file in the leaf directory.
htdocs_config – the httpd.conf version. This has the same rule as the above, but the rule is in the server-wide httpd.conf file and htaccess support is turned OFF (AllowOverride None).
Next, we need the .htaccess/httpd.conf files to do something (mainly so we can see whether Apache has merged them in). So we’ll create a number of files in the last random directory (the leaf node), give half of them the extension .foo and the other half .bar, and then tell Apache to process the .bar files with PHP and the .foo files as plain text. All files will have the same content.
Here’s the (Python) code I used to generate this structure:
```python
import os

# where we'll place the generated structure
staging = '/Users/simon/server'
htdocs_access = os.path.join(staging, 'htdocs_access')
htdocs_config = os.path.join(staging, 'htdocs_config')

# how deep to go!
dir_depth = 10

# how many files in the leaf node of the dir.
num_files = 50

# what content to put in the files
content = ""

# the actual htaccess file
htaccess = """
AddHandler application/x-httpd-php .bar
"""

# make directory structure
dir = ""
for dirnum in range(0, dir_depth):
    dir = os.path.join(dir, str(dirnum))
    hta = os.path.join(htdocs_access, dir)
    htc = os.path.join(htdocs_config, dir)
    os.makedirs(hta)
    os.makedirs(htc)

# make the files (in the leaf directory, where hta/htc now point):
for filenum in range(0, num_files):
    # assign the file types -- half .foo, and half .bar
    if filenum % 2 == 0:
        filename = '%d.foo' % filenum
    else:
        filename = '%d.bar' % filenum
    f = open(os.path.join(hta, filename), 'w+')
    f.write(content)
    f.close()
    f = open(os.path.join(htc, filename), 'w+')
    f.write(content)
    f.close()

# now, add the .htaccess file inside the leaf htdocs_access dir
f = open(os.path.join(hta, '.htaccess'), 'w+')
f.write(htaccess)
f.close()

# and we'll place it in the root of the htdocs_config dir as
# httpd.conf to remind ourselves to add it to the httpd.conf file
f = open(os.path.join(htdocs_config, 'httpd.conf'), 'w+')
f.write(htaccess)
f.close()
```
Here’s what we end up with:
```
0/
 1/
  2/
   3/
    4/
     5/
      6/
       7/
        8/
         9/
          0.foo
          1.bar
          10.foo
          11.bar
          (...etc...)
          6.foo
          7.bar
          8.foo
          9.bar
```
Where htdocs_access has a .htaccess file in 9/ and htdocs_config doesn’t.
Here are the two httpd.conf files for the configurations:
htdocs_config (rules in httpd.conf, .htaccess disabled):

```
ServerRoot "/usr/local/apache2"
PidFile logs/httpd.pid
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
DirectoryIndex index.html
AccessFileName .htaccess
HostnameLookups Off
AcceptMutex fcntl
StartServers 5
MinSpareServers 5
MaxSpareServers 5
MaxClients 100
MaxRequestsPerChild 10

User nobody
Group #-1
DocumentRoot "/Users/simon/server/htdocs_config"
LoadModule php5_module modules/libphp5.so
Listen 8111

<Directory "/Users/simon/server/htdocs_config">
    Options Indexes FollowSymLinks
    AllowOverride None
    AddHandler application/x-httpd-php .bar
</Directory>
```
htdocs_access (rule in the leaf .htaccess file, .htaccess enabled):

```
ServerRoot "/usr/local/apache2"
PidFile logs/httpd.pid
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
DirectoryIndex index.html
AccessFileName .htaccess
HostnameLookups Off
AcceptMutex fcntl
StartServers 5
MinSpareServers 5
MaxSpareServers 5
MaxClients 100
MaxRequestsPerChild 10

### Section 2: 'Main' server configuration
User nobody
Group #-1
DocumentRoot "/Users/simon/server/htdocs_access"
LoadModule php5_module modules/libphp5.so
Listen 8111

<Directory "/Users/simon/server/htdocs_access">
    Options Indexes FollowSymLinks
    AllowOverride All
</Directory>
```
Benchmarking was done with ab, the ApacheBench program, set to request one page 1,000 times at a concurrency of 10. Each configuration was benchmarked five times in random order (to minimise the effect of any running background processes, etc.).
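The runs looked roughly like the sketch below. The exact URL (port 8111, a file in the deep leaf directory) is my assumption from the configs above; the wrapper just shells out to ab and scrapes the mean figure from its report:

```python
import re
import subprocess

# Assumed target: the Listen 8111 port from the configs, and a .bar file
# in the leaf directory of the generated tree.
URL = 'http://localhost:8111/0/1/2/3/4/5/6/7/8/9/1.bar'

def parse_requests_per_second(ab_output):
    """Pull the mean requests/sec figure out of ab's text report."""
    m = re.search(r'Requests per second:\s+([\d.]+)', ab_output)
    return float(m.group(1)) if m else None

def run_ab(url, requests=1000, concurrency=10):
    """Run one benchmark pass and return the mean requests/sec."""
    out = subprocess.run(
        ['ab', '-n', str(requests), '-c', str(concurrency), url],
        capture_output=True, text=True).stdout
    return parse_requests_per_second(out)
```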
| Run | htdocs_config Time Taken (s) | htdocs_config Requests per Second | htdocs_access Time Taken (s) | htdocs_access Requests per Second |
|---|---|---|---|---|
| 1 | 12.683213 | 78.84 | 13.21618 | 75.66 |
| 2 | 12.854491 | 77.79 | 13.574916 | 73.67 |
| 3 | 11.777676 | 84.91 | 13.163296 | 75.97 |
| 4 | 13.668398 | 73.16 | 12.26475 | 81.53 |
| 5 | 13.76753 | 76.47 | 13.264527 | 75.39 |
| Mean | 12.9 | 78.23 | 13.1 | 76.4 |
So we’re looking at a difference of around 2.3% extra requests per second when .htaccess files are disabled. That’s really quite trivial, and only worth worrying about when your server is under serious load.
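For reference, the arithmetic behind that figure, using the mean requests per second from the table (whether it comes out at ~2.3% or ~2.4% depends on which mean you use as the base):

```python
# Means from the benchmark table above
config_rps = 78.23   # .htaccess disabled (rules in httpd.conf)
access_rps = 76.4    # .htaccess enabled

# Extra requests/sec gained by disabling .htaccess, relative to the
# .htaccess configuration
speedup = (config_rps - access_rps) / access_rps * 100
print(round(speedup, 1))  # prints 2.4
```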
There are a number of areas where this could be improved:
- Try different directory depths: the more deeply nested the requested file, the slower it should be under the .htaccess scenario. Conversely, with only two or three levels, the penalty should shrink.
- Have multiple .htaccess files in the intermediate nodes to see how Apache handles merging them. Here we’ve used just one .htaccess file, and we’d probably see further slowdowns if Apache had to merge some complicated rule sets.
- Access different files: I requested one file repeatedly, so we might be getting a lot of interference from caching layers (hard drive, RAM, PHP caches, etc.) that I forgot about. Additionally, requesting multiple URIs is a more realistic test case for a webserver.
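That last point can be sketched quickly: build a shuffled URL list covering every generated file, ready to feed to whatever benchmarking tool accepts a URL file. The host, port, and paths mirror my assumed setup above:

```python
import random

# Depth and file count mirror the generator script above; the host/port
# follow the assumed Listen 8111 config.
dir_depth = 10
num_files = 50
base = 'http://localhost:8111/' + '/'.join(str(d) for d in range(dir_depth))

urls = []
for filenum in range(num_files):
    # same half-.foo, half-.bar split as the generated tree
    ext = 'foo' if filenum % 2 == 0 else 'bar'
    urls.append('%s/%d.%s' % (base, filenum, ext))

# shuffle so the benchmark doesn't hit the files in a predictable order
random.shuffle(urls)
print(len(urls))  # prints 50: distinct URLs across both handlers
```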