.htaccess

From Wiki Notes @ WuJiewen.com, by Jiewen Wu
Revision as of 19:16, 7 March 2009 by Admin

.htaccess tips and tricks

Important Notes

This article was originally posted here. For simplicity and readability, some figures and paragraphs are not included in this wiki note.

Introduction

This work in constant progress is some collected wisdom, stuff I've learned on the topic of .htaccess hacking, commands I've used successfully in the past, on a variety of server setups, and in most cases still do. You may have to tweak the examples some to get the desired result, though, and a reliable test server is a powerful ally, preferably one with a very similar setup to your "live" server. Okay, to begin..

There's a good reason why you won't see .htaccess files on the web; almost every web server in the world is configured, by default, to refuse to serve them. Most operating systems hide them, too. Mainly it's the dot "." at the start, you see?

If you don't see, you'll need to tell your operating system to show hidden files, or use a text editor that allows you to open hidden files, something like BBEdit on the Mac platform. On Windows, showing invisibles in Explorer should allow any text editor to open them, and most decent editors to save them too**. Linux dudes know how to find them without any help from me.

  ** Even Notepad can save files beginning with a dot, if you put double-quotes around the name when you save it, i.e. ".htaccess". You can also use your FTP client to rename files beginning with a dot, even on your local filesystem; works great in FileZilla.
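
On the server side, the reason these files can't be fetched over the web is usually a rule in the main Apache config along the lines of the sketch below. This is the 2.2-era form, matching the syntax used in the rest of these notes; whether your own httpd.conf contains exactly this is an assumption, so check before relying on it..

```apache
# Refuse to serve any file whose name starts with ".ht"
# (.htaccess, .htpasswd, and friends)
<FilesMatch "^\.ht">
    Order allow,deny
    Deny from all
</FilesMatch>
```

With a rule like this in place, a request for /.htaccess gets a 403 error rather than the file's contents.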

What are htaccess files

Simply put, they are invisible plain text files where one can store server directives. Server directives are anything you might put in an Apache config file (httpd.conf) or even a php.ini**, but unlike those "master" directive files, these .htaccess directives apply only to the folder in which the .htaccess file resides, and all the folders inside.

This ability to plant .htaccess files in any directory of our site allows us to set up a finely-grained tree of server directives, each subfolder inheriting properties from its parent, whilst at the same time adding to, or overriding, certain directives with its own .htaccess file. For instance, you could use .htaccess to enable indexes all over your site, and then deny indexing in only certain subdirectories, or deny index listings site-wide, and allow indexing in certain subdirectories. One line in the .htaccess file in your root and your whole site is altered. From here on, I'll probably refer to the main .htaccess in the root of your website as "the master .htaccess file", or "main" .htaccess file.
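
A minimal sketch of that index-listing example, assuming the server permits Options overrides (AllowOverride Options, or All). The block shows two separate files, and the /private/ subfolder is a hypothetical name used only for illustration..

```apache
# --- /.htaccess (the master .htaccess file) ---
# Switch directory listings on for the whole site
Options +Indexes

# --- /private/.htaccess (hypothetical subfolder) ---
# Switch listings back off for this folder and everything below it
Options -Indexes
```

Swap the plus and minus signs to get the reverse arrangement: no listings site-wide, with listings allowed in the chosen subfolder.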

There's a small performance penalty for all this .htaccess file checking, because Apache has to look for an .htaccess file in every directory along the path on each request. It's rarely noticeable, though, and you'll find most of the time it's just on and there's nothing you can do about it anyway, so let's make the most of it..

  ** Your main php.ini, that is, unless you are running under phpsuexec, in which case the directives would go inside individual php.ini files.
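
As a sketch of the php.ini side of things: when PHP runs as an Apache module (an assumption; this won't work under CGI or phpsuexec setups), the php_flag and php_value directives let you set PHP options from .htaccess, provided the server allows Options overrides. The particular settings and values here are only illustrative..

```apache
# Boolean PHP settings use php_flag
php_flag display_errors off

# Everything else uses php_value
php_value upload_max_filesize 10M
```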

Is .htaccess enabled?

It's unusual, but possible, that .htaccess is not enabled on your site. If you are hosting it yourself, it's easy enough to fix: open your httpd.conf in a text editor, and locate this <Directory> section..

Your DocumentRoot may be different, of course..

   # This should be changed to whatever you set DocumentRoot to.
   #
   <Directory "/var/www/htdocs">
   #


..locate the line that reads..

   AllowOverride None


..and change it to..

   AllowOverride All


Restart Apache. Now .htaccess will work. You can also make this change inside a virtual host, which would normally be preferable.
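
A rough sketch of the virtual host variant, reusing the DocumentRoot from the example above. The ServerName is a placeholder, and your own virtual host will carry other directives besides these..

```apache
<VirtualHost *:80>
    # Placeholder host name; use your own
    ServerName example.com
    DocumentRoot "/var/www/htdocs"

    <Directory "/var/www/htdocs">
        # Let .htaccess files in this tree override the main config
        AllowOverride All
    </Directory>
</VirtualHost>
```

AllowOverride also accepts specific override classes (Limit or Options, for instance) in place of All, if you would rather open up only part of the configuration.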

If your site is hosted with someone else, check your control panel (Plesk, cPanel, etc.) to see if you can enable it there, and if not, contact your hosting admins. Perhaps they don't allow this, in which case, switch to a better web host.

Control access

.htaccess is most often used to restrict or deny access to individual files and folders. A typical example would be an "includes" folder. Your site's pages can call these included scripts all they like, but you don't want users accessing these files directly, over the web. In that case you would drop an .htaccess file in the includes folder with content something like this..

NO ENTRY!

   # no one gets in here!
   deny from all


which would deny ALL direct access to ANY files in that folder. You can be more specific with your conditions, for instance limiting access to a particular IP range. Here's a handy top-level rule for a local test server..

NO ENTRY outside of the LAN!

   # no nasty crackers in here!
   order deny,allow
   deny from all
   allow from 192.168.0.0/24
   # this would do the same thing..
   #allow from 192.168.0


Generally these sorts of requests would bounce off your firewall anyway, but on a live server (like my dev mirror sometimes is) they become useful for filtering out undesirable IP blocks, known risks, lots of things. By the way, in case you hadn't spotted, lines beginning with "#" are ignored by Apache; handy for comments.
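
The rules above use the Order/Allow/Deny syntax of Apache 2.2 and earlier. Apache 2.4 still accepts it through mod_access_compat, but its native access control uses Require instead. A rough equivalent of the LAN-only rule, assuming a 2.4 server..

```apache
# Apache 2.4+: only the local network gets in
Require ip 192.168.0.0/24
```

In the same syntax, the blanket "deny from all" becomes "Require all denied".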

Sometimes, you will only want to ban one IP, perhaps some persistent robot that doesn't play by the rules..

post user agent every fifth request only. hmmm. ban IP..

   # someone else giving the ruskies a bad name..
   order allow,deny
   deny from 83.222.23.219
   allow from all


The usual rules for IP addresses apply, so you can use partial matches, ranges, and so on. Either way, the user gets a 403 "Forbidden" error page in their client software (browser, usually), which certainly gets the message across. This is probably fine for most situations, but in part two I'll demonstrate some cooler ways to deny access.
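
On Apache 2.4 (an assumption about your server; the ban above is written in 2.2-style syntax), the same single-IP ban can be sketched with Require directives..

```apache
<RequireAll>
    # Let everyone in..
    Require all granted
    # ..except this one address
    Require not ip 83.222.23.219
</RequireAll>
```

A negated Require can't authorize a request on its own, hence the <RequireAll> wrapper around the pair.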