Difference between revisions of ".htaccess"

From Wiki Notes @ WuJiewen.com, by Jiewen Wu

Revision as of 19:19, 7 March 2009

.htaccess tips and tricks

Important Notes

This article was originally posted here. For simplicity and readability, some figures and paragraphs are not included in this wiki note.

Introduction

This work in constant progress is some collected wisdom, stuff I've learned on the topic of .htaccess hacking, commands I've used successfully in the past, on a variety of server setups, and in most cases still do. You may have to tweak the examples some to get the desired result, though, and a reliable test server is a powerful ally, preferably one with a very similar setup to your "live" server. Okay, to begin..

There's a good reason why you won't see .htaccess files on the web; almost every web server in the world is configured to refuse to serve them, by default. Most operating systems hide them, too. Mainly it's the dot "." at the start, you see?

If you don't see, you'll need to disable your operating system's invisible file functions, or use a text editor that allows you to open hidden files, something like BBEdit on the Mac platform. On Windows, showing invisibles in Explorer should allow any text editor to open them, and most decent editors to save them too**. Linux dudes know how to find them without any help from me.

  • ** Even Notepad can save files beginning with a dot, if you put double-quotes around the name when you save it, i.e. ".htaccess". You can also use your FTP client to rename files beginning with a dot, even on your local filesystem; works great in FileZilla.

What are htaccess files

Simply put, they are invisible plain text files where one can store server directives. Server directives are the sort of thing you might put in an Apache config file (httpd.conf) or even a php.ini**, though not every directive is allowed in an .htaccess file. Unlike those "master" directive files, these .htaccess directives apply only to the folder in which the .htaccess file resides, and all the folders inside.

This ability to plant .htaccess files in any directory of our site allows us to set up a finely-grained tree of server directives, each subfolder inheriting properties from its parent, whilst at the same time adding to, or over-riding certain directives with its own .htaccess file. For instance, you could use .htaccess to enable indexes all over your site, and then deny indexing in only certain subdirectories, or deny index listings site-wide, and allow indexing in certain subdirectories. One line in the .htaccess file in your root and your whole site is altered. From here on, I'll probably refer to the main .htaccess in the root of your website as "the master .htaccess file", or "main" .htaccess file.
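
As a minimal sketch of that indexes example, assuming the server permits the Options override (AllowOverride Options, or All) and has mod_autoindex loaded; the "private" folder is a made-up name, purely for illustration..

```apache
# /.htaccess (the master file): allow directory listings site-wide
Options +Indexes

# /private/.htaccess (hypothetical subfolder): switch listings off here only
Options -Indexes
```

Those are two separate files, shown together. Swap the plus and minus signs for the opposite arrangement: no listings site-wide, listings allowed in the one subfolder.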

There's a small performance penalty for all this .htaccess file checking, but not noticeable, and you'll find most of the time it's just on and there's nothing you can do about it anyway, so let's make the most of it..

  • ** Your main php.ini, that is, unless you are running under phpsuexec, in which case the directives would go inside individual php.ini files.

Is .htaccess enabled?

It's unusual, but possible that .htaccess is not enabled on your site. If you are hosting it yourself, it's easy enough to fix; open your httpd.conf in a text editor, and locate this <Directory> section..

Your DocumentRoot may be different, of course..

   # This should be changed to whatever you set DocumentRoot to.
   #
   <Directory "/var/www/htdocs">
   #


..locate the line that reads..

   AllowOverride None


..and change it to..

   AllowOverride All


Restart Apache. Now .htaccess will work. You can also make this change inside a virtual host, which would normally be preferable.
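
Here is a rough sketch of the virtual host approach. The host name and paths are placeholders, not taken from any real setup, and rather than All it lists only the override classes that the examples in this article would seem to need (AuthConfig for password protection, FileInfo for ErrorDocument and mod_rewrite, Limit for order/allow/deny, Options for indexes)..

```apache
# hypothetical virtual host; ServerName and paths are placeholders
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"

    <Directory "/var/www/example">
        # permit only the kinds of .htaccess directive you intend to use;
        # "AllowOverride All" would permit everything
        AllowOverride AuthConfig FileInfo Limit Options
    </Directory>
</VirtualHost>
```

Restart Apache after editing, same as before.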

If your site is hosted with someone else, check your control panel (Plesk, cPanel, etc.) to see if you can enable it there, and if not, contact your hosting admins. Perhaps they don't allow this. In which case, switch to a better web host.

Control access

.htaccess is most often used to restrict or deny access to individual files and folders. A typical example would be an "includes" folder. Your site's pages can call these included scripts all they like, but you don't want users accessing these files directly, over the web. In that case you would drop an .htaccess file in the includes folder with content something like this..

NO ENTRY!

   # no one gets in here!
   deny from all


which would deny ALL direct access to ANY files in that folder. You can be more specific with your conditions, for instance limiting access to a particular IP range, here's a handy top-level rule for a local test server..

NO ENTRY outside of the LAN!

   # no nasty crackers in here!
   order deny,allow
   deny from all
   allow from 192.168.0.0/24
   # this would do the same thing..
   #allow from 192.168.0


Generally these sorts of requests would bounce off your firewall anyway, but on a live server (like my dev mirror sometimes is) they become useful for filtering out undesirable IP blocks, known risks, lots of things. By the way, in case you hadn't spotted; lines beginning with "#" are ignored by Apache; handy for comments.
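
The order/deny/allow directives above are the syntax of Apache 2.2 and earlier. On Apache 2.4 they only survive through the compatibility module (mod_access_compat), and the same two rules would normally be written with Require. A sketch, assuming a 2.4 server with mod_authz_core and mod_authz_host loaded..

```apache
# includes/.htaccess: no one gets in here (Apache 2.4 syntax)
Require all denied

# top-level .htaccess on the test server: LAN only (Apache 2.4 syntax)
Require ip 192.168.0.0/24
```

Again, that is two separate files shown in one listing. It is probably best not to mix the old and new styles in the same file.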

Sometimes, you will only want to ban one IP, perhaps some persistent robot that doesn't play by the rules..

post user agent every fifth request only. hmmm. ban IP..

   # someone else giving the ruskies a bad name..
   order allow,deny
   deny from 83.222.23.219
   allow from all


The usual rules for IP addresses apply, so you can use partial matches, ranges, and so on. Either way, the user gets a 403 "Forbidden" error page in their client software (browser, usually), which certainly gets the message across. This is probably fine for most situations, but in part two I'll demonstrate some cooler ways to deny access.
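
For what it's worth, on Apache 2.4 the single-IP ban would more likely be written with Require than with order/allow/deny. A sketch, assuming mod_authz_core and mod_authz_host are loaded..

```apache
# Apache 2.4 syntax: let everyone in except one address
<RequireAll>
    Require all granted
    Require not ip 83.222.23.219
</RequireAll>
```

The RequireAll wrapper is needed because a negated "Require not" line cannot grant access by itself; it can only take access away from whatever the other lines allow.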

Custom error documents

I guess I should briefly mention that .htaccess is where most folk configure their error documents. Usually with something like this..

the usual method. the "err" folder (with the custom pages) is in the root

   # custom error documents
   ErrorDocument 401 /err/401.php
   ErrorDocument 403 /err/403.php
   ErrorDocument 404 /err/404.php
   ErrorDocument 500 /err/500.php


You can also specify external URLs, though this can be problematic (the visitor gets redirected, so their client never sees the original error code), and is best avoided. One quick and simple method is to specify the text in the directive itself; you can even use HTML (though there is probably a limit to how much HTML you can squeeze onto one line). Remember, for Apache 1, begin with a ", but DO NOT end with one. For Apache 2, you can put a second quote at the end, as normal.

measure twice, quote once..

   # quick custom error "document"..
   ErrorDocument 404 "<html><head><title>NO!</title></head><body>blah blah</body></html>
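
The example above is in the Apache 1 form, with no closing quote. For comparison, here is a sketch of the Apache 2 form, plus the external URL variant mentioned earlier; the example.com address is a placeholder, not a real error page..

```apache
# Apache 2: the message is quoted at both ends
ErrorDocument 404 "<html><head><title>NO!</title></head><body>blah blah</body></html>"

# external URL (hypothetical address): the server answers with a redirect,
# so the client does not receive the original 404 status
ErrorDocument 404 http://www.example.com/err/404.php
```

Only one of the two would be used at a time; if both appear at the same level, the later line wins.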


Using a custom error document is a Very Good Idea, and will give you a second chance at your almost-lost visitors. I recommend you download mine. But then, I would.

Password protected directories

  • Get better protection: The authentication examples in this section assume that your web server supports "Basic" HTTP authorisation; as far as I know they all do (it's in the Apache core). Trouble is, some browsers aren't sending passwords this way any more, and personally I'm looking to php to cover my authorization needs. Basic auth works okay though, even if it isn't actually very secure: unless the connection is encrypted, your password travels over the wire as good as plain text (it's only base64-encoded), not clever.

If you have php, and are looking for a more secure login facility, check out pajamas. It's free. If you are looking for a password-protected download facility (and much more, besides), check out my distro machine, also free.

The next most obvious use for our .htaccess files is to allow access to only specific users, or user groups; in other words, password protected folders. A simple authorisation mechanism might look something like this..

a simple sample .htaccess file for password protection:

   AuthType Basic
   AuthName "restricted area"
   AuthUserFile /usr/local/var/www/html/.htpasses
   require valid-user
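
The sample above lets in any valid user. To restrict a folder to a user group instead, something along these lines should do it. This is a sketch only: it assumes mod_authz_groupfile is loaded, and the .htgroups file, the "admins" group and the user "sarah" are all invented for illustration..

```apache
AuthType Basic
AuthName "restricted area"
AuthUserFile /usr/local/var/www/html/.htpasses
# hypothetical group file, plain text, one group per line, like so:
#   admins: jimmy sarah
AuthGroupFile /usr/local/var/www/html/.htgroups
Require group admins
```

For a single named user, "Require user jimmy" in place of the last line does the job, and no group file is needed.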


You can use this same mechanism to limit only certain kinds of requests, too..

only valid users can POST in here, anyone can GET, PUT, etc:

   AuthType Basic
   AuthName "restricted area"
   AuthUserFile /usr/local/var/www/html/.htpasses
   <Limit POST>
    require valid-user
   </Limit>
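
Note that a Limit block only covers the methods it names; everything else stays wide open. If what you actually want is the reverse (lock down everything except the harmless methods), LimitExcept is probably the safer tool. A sketch, reusing the same password file as above..

```apache
AuthType Basic
AuthName "restricted area"
AuthUserFile /usr/local/var/www/html/.htpasses
# anyone may GET; every other method needs a login
<LimitExcept GET>
    Require valid-user
</LimitExcept>
```

This way, a method you forgot to think about is protected by default, rather than left open by default.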


You can find loads of online examples of how to set up authorization using .htaccess. So long as you have a real user (or create one; in this case, 'jimmy') with a real password (you will be prompted for this, twice) in a real password file (the -c switch will create it)..

htpasswd -c /usr/local/var/www/html/.htpasses jimmy

..the above will work just fine. htpasswd is a tool that comes free with Apache, specifically for making and updating password files; check it out. Only use -c the first time, mind: it overwrites an existing password file, so leave it off when adding more users. The Windows version is the same; only the file path needs to be changed, to wherever you want to put the password file.

Note: if the Apache bin/ folder isn't in your PATH, you will need to cd into that directory before performing the command. Also note: You can use forward and back-slashes interchangeably with Apache/php on Windows, so this would work just fine..

htpasswd -c c:/unix/usr/local/Apache2/conf/.htpasses jimmy

Relative paths are fine too; assuming you were inside the bin/ directory of our fictional Apache install, the following would do exactly the same as the above..

htpasswd -c ../conf/.htpasses jimmy

Naming the password file .htpasses is a habit from when I had to keep that file inside the web site itself; as Apache's default configuration refuses to serve files beginning with .ht, they, too, remain hidden. If you keep your password file outside the web root (a better idea), then you can call it whatever you like, but the .ht_something habit is a good one to keep. Even inside the web tree, it is secure enough for our basic purpose..

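Once they are logged in, you can access the REMOTE_USER environment variable, and do stuff with it..

the REMOTE_USER variable is now available..

   RewriteEngine on
   # only when someone is actually logged in..
   RewriteCond %{REMOTE_USER} !^$
   # ..and not already inside /users/ (or the rule would loop)
   RewriteCond %{REQUEST_URI} !^/users/
   RewriteRule ^(.*)$ /users/%{REMOTE_USER}/$1 [L]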


Which is a handy directive, utilizing mod_rewrite; a subject I delve into far more deeply, in part two.