.htaccess

A nice English-language article on where htaccess can be used.

Comprehensive guide to .htaccess

Tutorial written and contributed by Feyd, moderator of the JK Forum, with additions by JavaScriptKit.com. Please see the tutorial footnote for additional/bio info on the author. Last updated: Jan 18th, '06 for additional section.

I am sure that most of you have heard of htaccess, if only vaguely, and that you may think you have a fair idea of what can be done with an htaccess file. You are more than likely mistaken about that, however. Regardless, even if you have never heard of htaccess and what it can do for you, the intention of this tutorial is to get you two moving along nicely together.

If you have heard of htaccess, chances are that it has been in relation to implementing custom error pages or password protected directories.
But there is much more available to you through the marvelously simple .htaccess file.

A Few General Ideas

An htaccess file is a simple ASCII file, such as you would create through a text editor like NotePad or SimpleText. Many people seem to have some confusion over the naming convention for the file, so let me get that out of the way.

.htaccess is the whole file name: an extension with nothing in front of it. It is not file.htaccess or somepage.htaccess, it is simply named .htaccess

In order to create the file, open up a text editor and save an empty page as .htaccess (or type in one character, as some editors will not let you save an empty page). Chances are that your editor will append its default file extension to the name (ex: for Notepad it would call the file .htaccess.txt). You need to remove the .txt (or other) file extension in order to get yourself htaccessing--yes, I know that isn't a word, but it sounds keen, don't it? You can do this by right clicking on the file and renaming it by removing anything that doesn't say .htaccess. You can also rename it via telnet or your ftp program, and you should be familiar enough with one of those so as not to need explaining.

htaccess files must be uploaded in ASCII mode, not BINARY. You may need to CHMOD the htaccess file to 644 (rw-r--r--). This makes the file readable by the server while letting only its owner change it. You also want to keep the file from being read by a browser, since that can seriously compromise your security. (For example, if you have password protected directories and a browser can read the htaccess file, then the visitor can get the location of the authentication file and reverse engineer the list to get full access to any portion that you previously had protected. There are different ways to prevent this, one being to place all your authentication files above the root directory so that they are not www accessible, and the other being a series of htaccess commands that prevents the htaccess file itself from being accessed by a browser, more on that later.)

Most commands in htaccess are meant to be placed on one line only, so if you use a text editor that uses word-wrap, make sure it is disabled or it might throw in a few characters that annoy Apache to no end, although Apache is typically very forgiving of malformed content in an htaccess file.

htaccess is an Apache thing, not an NT thing. There are similar capabilities for NT servers, though in my professional experience and personal opinion, NT's ability in these areas is severely handicapped. But that's not what we're here for.

htaccess files affect the directory they are placed in and all sub-directories, that is, an htaccess file located in your root directory (yoursite.com) would affect yoursite.com/content, yoursite.com/content/contents, etc. It is important to note that this can be overridden (if, for example, you did not want certain htaccess commands to affect a specific directory) by placing a new htaccess file within the directory you don't want affected and setting the specific command(s) differently in that new file. Apache reads every htaccess file on the way down from the root to the requested directory, and where two of them disagree, the one nearest to the current directory wins. If the only htaccess file is your global htaccess located in your root, then it affects every single directory in your entire site.

Before you go off and plant htaccess everywhere, read through this and make sure you don't do anything redundant, since it is possible to cause an infinite loop of redirects or errors if you place something weird in the htaccess.
Also...some sites do not allow use of htaccess files, since depending on what they are doing, they can slow down a server overloaded with domains if they are all using htaccess files. I can't stress this enough: You need to make sure you are allowed to use htaccess before you actually use it. Some things that htaccess can do can compromise a server configuration that has been specifically set up by the admin, so don't get in trouble.

Error Documents

This seems to be what people think htaccess was meant for, but it is only part of the general use. We'll be getting into progressively more advanced stuff after this.

Successful Client Requests
    200 OK
    201 Created
    202 Accepted
    203 Non-Authoritative Information
    204 No Content
    205 Reset Content
    206 Partial Content

Client Request Redirected
    300 Multiple Choices
    301 Moved Permanently
    302 Moved Temporarily
    303 See Other
    304 Not Modified
    305 Use Proxy

Client Request Errors
    400 Bad Request
    401 Authorization Required
    402 Payment Required (not used yet)
    403 Forbidden
    404 Not Found
    405 Method Not Allowed
    406 Not Acceptable (encoding)
    407 Proxy Authentication Required
    408 Request Timed Out
    409 Conflicting Request
    410 Gone
    411 Content Length Required
    412 Precondition Failed
    413 Request Entity Too Large
    414 Request URI Too Long
    415 Unsupported Media Type

Server Errors
    500 Internal Server Error
    501 Not Implemented
    502 Bad Gateway
    503 Service Unavailable
    504 Gateway Timeout
    505 HTTP Version Not Supported

In order to specify your own ErrorDocuments, you need to be slightly familiar with the server returned error codes (listed above). You do not need to specify error pages for all of these, in fact you shouldn't. An ErrorDocument for code 200 would cause an infinite loop, whenever a page was found...this would not be good.

You will probably want to create an error document for codes 404 and 500, at the least 404 since this would give you a chance to handle requests for pages not found. 500 would help you out with internal server errors in any scripts you have running. You may also want to consider ErrorDocuments for 401 - Authorization Required (as in when somebody tries to enter a protected area of your site without the proper credentials), 403 - Forbidden (as in when a file with permissions not allowing it to be accessed by the user is requested) and 400 - Bad Request, which is one of those generic kind of errors that people get to by doing some weird stuff with your URL or scripts.

In order to specify your own customized error documents, you simply need to add the following command, on one line, within your htaccess file:

    ErrorDocument code /directory/filename.ext

or

    ErrorDocument 404 /errors/notfound.html

This would cause any request resulting in a 404 error to be forwarded to yoursite.com/errors/notfound.html

Likewise with:

    ErrorDocument 500 /errors/internalerror.html

You can name the pages anything you want (I'd recommend something that would prevent you from forgetting what the page is being used for), and you can place the error pages anywhere you want within your site, so long as they are web-accessible (through a URL). The initial slash in the directory location represents the root directory of your site, that being where your default page for your first-level domain is located. I typically prefer to keep them in a separate directory for maintenance purposes and in order to better control spiders indexing them through a ROBOTS.TXT file, but it is entirely up to you.
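Because htaccess directives apply to the directory they sit in and everything beneath it, you can set error documents once for the whole site and then swap one out for a single section. Here is a minimal sketch of that idea. The /members/ directory and the file names are made up for illustration, and it assumes your host lets htaccess files set ErrorDocument at all (in Apache terms, AllowOverride FileInfo).

```apache
# File: /.htaccess (site root), applies to the whole site
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/internalerror.html

# File: /members/.htaccess (hypothetical sub-directory)
# Only the 404 page is replaced here. The 500 page set in the
# root htaccess still applies to /members/ and everything below it.
ErrorDocument 404 /members/errors/notfound.html
```

With these two files in place, a request for a missing page under yoursite.com/members/ would get the members-only 404 page, while a missing page anywhere else on the site would still get /errors/notfound.html.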
If you were to use an error document handler for each of the error codes I mentioned, the htaccess file would look like the following (note each command is on its own line):

ErrorDocument 400 /errors/badrequest.html
ErrorDocument 401 /errors/authreqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/serverr.html

You can specify a full URL rather than a virtual URL in the ErrorDocument string (http://yoursite.com/errors/notfound.html vs. /errors/notfound.html). But this is not the preferred method by the server's happiness standards: with a full URL the server answers with a redirect, so the browser never receives the original error code.

You can also specify HTML, believe it or not!

ErrorDocument 401 "You have to actually BE a member to view this page, Colonel!

The only time I use that HTML option is if I am feeling particularly saucy, since you can have so much more control over the error pages when used in conjunction with xSSI or CGI or both. Also note that the ErrorDocument starts with a " just before the HTML starts, but does not end with one...it shouldn't end with one and if you do use that option, keep it that way. (That is the Apache 1.3 form; Apache 2.0 and later expect the text to be closed with a matching " as well.) And again, that should all be on one line, no naughty word wrapping!

Next, we are moving on to password protection, that last frontier before I dunk you into the true capabilities of htaccess. If you are familiar with setting up your own password protected directories via htaccess, you may feel like skipping ahead.

Password protection

Ever wanted a specific directory in your site to be available only to people who you want it to be available to? Ever got frustrated with the seeming holes in client-side options for this that allowed virtually anyone with enough skill to mess around in your source to get in? htaccess is the answer!

There are numerous methods of password protecting areas of your site, some server language based (such as ASP, PHP or Perl) and some client side based, such as JavaScript. JavaScript is not as secure or foolproof as a server-side option; a server-side challenge/response is always more secure than a client-dependent challenge/response. htaccess is about as secure as you can or need to get in everyday life, though there are ways above and beyond even that of htaccess. If you aren't comfortable enough with htaccess, you can password protect your pages any number of ways, and JavaScript Kit has plenty of password protection scripts for your use.

The first thing you will need to do is create a file called .htpasswd. I know, you might have problems with the naming convention, but it is the same idea behind naming the htaccess file itself, and you should be able to do that by this point.

In the htpasswd file, you place the username and password (which is encrypted) for those whom you want to have access. For example, for a username and password of wsabstract (and I do not recommend having the username being the same as the password), the htpasswd file would look like this:

wsabstract:y4E7Ep8e7EYV

Notice that it is the username first, followed by a colon and then the encrypted password. There is a handy-dandy tool available for you to easily encrypt the password into the proper encoding for use in the htpasswd file.

For security, you should not upload the htpasswd file to a directory that is web accessible (yoursite.com/.htpasswd); it should be placed above your www root directory. You'll be specifying the location to it later on, so be sure you know where you put it. Also, this file, as with htaccess, should be uploaded as ASCII and not BINARY.

Create a new htaccess file and place the following code in it:

AuthUserFile /usr/local/you/safedir/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic
require user wsabstract

The first line is the full server path to your htpasswd file. If you have installed scripts on your server, you should be familiar with this. Please note that this is not a URL, this is a server path. Also note that if you place this htaccess file in your root directory, it will password protect your entire site, which probably isn't your exact goal.

The last line, require user, is where you enter the username of those who you want to have access to that portion of your site. Note that using this will allow only that specific user to access that directory. This applies if you had an htpasswd file that had multiple users set up in it and you wanted each one to have access to an individual directory. If you wanted the entire list of users to have access to that directory, you would replace require user xxx with require valid-user.

The AuthName is the name of the area you want to access. It could be anything, such as "EnterPassword".

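As a rough sketch of the valid-user variation just described, the same block could be written as follows. The path is the one from the example above; the realm name "Members Area" is made up for illustration, and it is quoted because a realm name containing a space needs to be.

```apache
# Hypothetical variant of the block above: admit every user listed in .htpasswd
AuthUserFile /usr/local/you/safedir/.htpasswd
AuthGroupFile /dev/null
# "Members Area" is an illustrative realm name; the quotes are needed because of the space
AuthName "Members Area"
AuthType Basic
require valid-user
```
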
You can change the name of this 'realm' to whatever you want, within reason. We are using AuthType Basic because we are using basic HTTP authentication.

Enabling SSI Via htaccess

Many people want to use SSI, but don't seem to have the ability to do so with their current web host. You can change that with htaccess. A note of caution first...definitely ask permission from your host before you do this, it can be considered 'hacking' or violation of your host's TOS, so be safe rather than sorry:

AddType text/html .shtml
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes

The first line tells the server that pages with a .shtml extension (for Server parsed HTML) are valid. The second line adds a handler, the actual SSI bit, to all files named .shtml. This tells the server that any file named .shtml should be parsed for server side commands. The last line sets the directory options: Includes is what actually permits server side includes, while Indexes and FollowSymLinks allow directory listings and symbolic links.

And that's it, you should have SSI enabled. But wait...don't feel like renaming all of your pages to .shtml in order to take advantage of this neat little toy? Me neither! Just add this line to the fragment above, between the first and second lines:

AddHandler server-parsed .html

A note of caution on that one too, however. This will force the server to parse every page named .html for SSI commands, even if they have no SSI commands within them. If you are using SSI sparingly on your site, this is going to give you more server drain than you can justify. SSI does slow down a server because it does extra work before serving up a page, although in human terms of speed, it is virtually transparent. Some people also prefer to allow SSI in .html pages so that nobody can tell from the page extension that SSI is in use, which makes it harder to target the server with SSI hacks (these are possible). Either way, you now have the knowledge to use it.

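For reference, with the extra .html handler slotted in between the first and second lines, the whole fragment would look roughly like this. It is only a sketch that assumes your host permits all of these directives:

```apache
AddType text/html .shtml
# Added line: also parse plain .html files for SSI commands
AddHandler server-parsed .html
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes
```
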
If, however, you are going to keep SSI pages with the extension of .shtml, and you want to use SSI on your Index pages, you need to add the following line to your htaccess:

DirectoryIndex index.shtml index.html

This allows a page named index.shtml to be your default page, and if that is not found, index.html is loaded. More on DirectoryIndex later.

Blocking users by IP

Is there a pesky person perpetrating pain upon you? Stalking your site from the vastness of the electron void? Block 'em! In your htaccess file, add the following code (changing the IPs to suit your needs), each command on its own line:

order allow,deny
deny from 123.45.6.7
deny from 012.34.5.
allow from all

You can deny access based upon IP address or an IP block. The above blocks access to the site from 123.45.6.7, and from any address within the IP block 012.34.5. (012.34.5.1, 012.34.5.2, 012.34.5.3, etc.) I have yet to find a useful application of this; maybe if there is a site scraping your content you can block them, who knows.

You can also set an option for deny from all, which would of course deny everyone. You can also allow or deny by domain name rather than IP address (allow from .javascriptkit.com works for www.javascriptkit.com or virtual.javascriptkit.com, etc.)

Blocking users/ sites by referrer

Note: This portion of tutorial written by JavaScript Kit

Blocking users or sites that originate from a particular domain is another useful trick of .htaccess. Let's say you check your logs one day, and see tons of referrals from a particular site, yet upon inspection you can't find a single visible link to your site on theirs. The referral isn't a "legitimate" one, with the site most likely hot linking to certain files on your site such as images, .css files, or files you can't even make out. Remember, your logs will generate a referrer entry for any kind of reference to your site that has a traceable origin.

Before I get to the code itself, it's important to note that blocking access by referrer in .htaccess requires the help of the Apache module mod_rewrite to make out the referrer first. This module is installed by default on most servers (ask your host if you're not sure). So, to deny access to all traffic that originates from a particular domain (referrer), use the following code:

Block traffic from a single referrer:

RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} badsite\.com [NC]
RewriteRule .* - [F]

Block traffic from multiple referrers:

RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} badsite\.com [NC,OR]
RewriteCond %{HTTP_REFERER} anotherbadsite\.com [NC]
RewriteRule .* - [F]

In the "single referrer" case above, "badsite\.com" is the domain you wish to block. Note the backslash preceding the period (".") to actually denote a period; in Regular Expressions, a bare period denotes any character, which is not what we want. The flag "[NC]" is added to the end of the domain to make it case insensitive, so whether the domain is "badsite.com", "Badsite.com" etc, however bad it gets, it gets blocked. Finally, the last line in the .htaccess file specifies that the action to take when a match is found is to fail the request, meaning the referrer traffic will hit a 403 Forbidden error.

The only difference between blocking a single referrer and multiple referrers is the [NC,OR] flag, which in the latter case goes on every domain but the last (the last one keeps a plain [NC]).

Now, you may have noticed the line "Options +FollowSymlinks" above, which is commented out. Uncomment this line if your server isn't configured with FollowSymLinks in its <Directory> section in httpd.conf, and you get a 500 Internal Server error when using the code above as is.

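Since the referrer pattern is an ordinary regular expression, the two conditions could probably also be folded into one using alternation. This is only a sketch with the same placeholder domains as above, not something taken from the original tutorial:

```apache
RewriteEngine on
# Options +FollowSymlinks
# One condition covering both placeholder domains; the dot is escaped so it matches literally
RewriteCond %{HTTP_REFERER} (badsite|anotherbadsite)\.com [NC]
RewriteRule .* - [F]
```

With a single condition there is no need for the OR flag, which may be easier to maintain once the list of domains grows.
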
Blocking bad bots and site rippers (aka offline browsers)

Note: This portion of tutorial written by JavaScript Kit

The definition of a "bad bot" varies depending on who you ask, but most would agree they are the spiders that do a lot more harm than good on your site (ie: an email harvester). A site ripper, on the other hand, is an offline browsing program that a surfer may unleash on your site to crawl and download every one of its pages for offline viewing. In both cases, both your site's bandwidth and resource usage are jacked up as a result, sometimes to the point of crashing your server. Bad bots typically ignore the wishes of your robots.txt file, so you'll want to ban them using means such as .htaccess. The trick is to identify a bad bot.

Below is a useful code block you can insert into your .htaccess file for blocking a lot of the known bad bots and site rippers currently out there. It is derived from my reading of the excellent discussion "A close to perfect .htaccess file", specifically, "A close to perfect .htaccess file II." Simply add the below code to your .htaccess file:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

Spaces inside a user agent name are escaped with a backslash so that each pattern stays a single argument. Bots that are listed above will all receive a 403 Forbidden error when trying to view your site. The amount of bandwidth savings and decrease in server resource usage as a result may be significant in many cases.

Change your default directory page

Some of you may be wondering, just what in the world is a DirectoryIndex? Well, grasshopper, this is a command which allows you to specify a file that is to be loaded as your default page whenever a directory or URL request comes in that does not specify a specific page. Tired of having yoursite.com/index.html come up when you go to yoursite.com? Want to change it to be yoursite.com/ILikePizzaSteve.html that comes up instead? No problem!

DirectoryIndex filename.html

This would cause filename.html to be treated as your default page, or default directory page. You can also append other filenames to it. You may want to have certain directories use a script as a default page. That's no problem too!

DirectoryIndex filename.html index.cgi index.pl default.htm

Placing the above command in your htaccess file will cause this to happen: When a user types in yoursite.com, your site will look for filename.html in your root directory (or any directory if you specify this in the global htaccess), and if it finds it, it will load that page as the default page. If it does not find filename.html, it will then look for index.cgi; if it finds that one, it will load it; if not, it will look for index.pl, and the whole process repeats until it finds a file it can use. Basically, the list of files is read from left to right.

Every once in a while, I use this method for the following needs: Say I keep all my include files in a directory called include, and all my image files in a directory called images, and I don't want people to be able to directory browse through them (even though we can prevent that through another htaccess trick, more later). I would specify a DirectoryIndex entry, in a specific htaccess file for those two directories, for /redirect/index.pl, a redirect page that sends any request for those directories to the homepage. Or I could just specify a directory index of index.pl and upload an index.pl file to each of those directories. Or I could just stick in an htaccess redirect page, which is our next subject!

Redirects

Ever go through the nightmare of changing significant portions of your site, then having to deal with the problem of people finding their way from the old pages to the new? It can be nasty. There are different ways of redirecting pages, through http-equiv, JavaScript or any of the server-side languages. And then you can do it through htaccess, which is probably the most effective, considering the minimal amount of work required to do it.

htaccess uses Redirect to look for any request for a specific page (or a non-specific location, though this can cause infinite loops) and if it finds that request, it forwards it to a new page you have specified:

Redirect /olddirectory/oldfile.html http://yoursite.com/newdirectory/newfile.html

Note that there are 3 parts to that, which should all be on one line: the Redirect command, the location of the file/directory you want redirected relative to the root of your site (/olddirectory/oldfile.html = yoursite.com/olddirectory/oldfile.html) and the full URL of the location you want that request sent to. Each of the 3 is separated by a single space, but all on one line. You can also redirect an entire directory by simply using

Redirect /olddirectory http://yoursite.com/newdirectory/

Using this method, you can redirect any number of pages no matter what you do to your directory structure. It is the fastest method that has a global effect.

Prevent viewing of .htaccess file

If you use htaccess for password protection, then the location containing all of your password information is plainly available through the htaccess file. If you have set incorrect permissions or if your server is not as secure as it could be, a browser has the potential to view an htaccess file through a standard web interface and thus compromise your site/server. This, of course, would be a bad thing. However, it is possible to prevent an htaccess file from being viewed in this manner:

<Files .htaccess>
order allow,deny
deny from all
</Files>

The first line specifies that the file named .htaccess is having this rule applied to it. You could use this for other purposes as well if you get creative enough. If you use this in your htaccess file, a person trying to see that file would get returned (under most server configurations) a 403 error code.

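One way to get creative, sketched here as an assumption rather than a rule from this tutorial: widen the match so the same deny applies to every file whose name starts with .ht, which would also cover a stray .htpasswd. FilesMatch takes a regular expression instead of a plain file name; on Apache 2.4 and later the order/deny pair is usually written as Require all denied instead.

```apache
# Deny browser access to any file whose name begins with ".ht"
<FilesMatch "^\.ht">
order allow,deny
deny from all
</FilesMatch>
```
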
You can also tighten the permissions on your htaccess file via CHMOD as an added measure of security, so that only its owner can write to it: 644 or RW-R--R--

Adding MIME Types

What if your server wasn't set up to deliver certain file types properly? A common occurrence with MP3 or even SWF files. Simple enough to fix:

AddType application/x-shockwave-flash swf

AddType is specifying that you are adding a MIME type. The application string is the actual parameter of the MIME type you are adding, and the final little bit is the default extension for the MIME type you just added; in our example this is swf, for Shockwave Flash.

By the way, here's a neat little trick that few know about, but you get to be part of the club since JavaScript Kit loves you: To force a file to be downloaded, via the Save As browser feature, you can simply set a MIME type to application/octet-stream and that immediately prompts you for the download. I have no idea how that would be useful, but that question has come up in our Forums from time to time, so there ya' go.

Preventing hot linking of images and other file types

Note: This portion of tutorial written by JavaScript Kit

In the webmaster community, "hot linking" is a curse phrase. Also known as "bandwidth stealing" by the angry site owner, it refers to linking directly to non-html objects not on one's own server, such as images, .js files etc. The victim's server in this case is robbed of bandwidth (and in turn money) as the violator enjoys showing content without having to pay for its delivery. The most common practice of hot linking pertains to another site's images.

Using .htaccess, you can disallow hot linking on your server, so those attempting to link to an image or CSS file on your site, for example, are either blocked (failed request, such as a broken image) or served different content (ie: an image of an angry man). Note that mod_rewrite needs to be enabled on your server in order for this aspect of .htaccess to work.

Ask your web host about this. With all the pieces in place, here's how to disable hot linking of certain file types on your site, in the case below, images, JavaScript (js) and CSS (css) files. Simply add the below code to your .htaccess file, and upload the file either to your root directory, or a particular subdirectory to localize the effect to just one section of your site:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|js|css)$ - [F]

Be sure to replace "mydomain\.com" with your own domain. The above code creates a failed request when hot linking of the specified file types occurs. In the case of images, a broken image is shown instead.

Serving alternate content when hot linking is detected

You can set up your .htaccess file to actually serve up different content when hot linking occurs. This is more commonly done with images, such as serving up an Angry Man image in place of the hot linked one. The code for this is:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.(gif|jpg)$ http://www.mydomain.com/angryman.gif [R,L]

Same deal: replace mydomain.com with your own domain, and angryman.gif with your own image. Keep in mind that angryman.gif itself ends in .gif, so it is safest to host it somewhere this rule does not apply (another directory or domain); otherwise a hot linked request for it would be redirected to itself. Time to pour a bucket of cold water on hot linking!

Preventing Directory Listing

Do you have a directory full of images or zips that you do not want people to be able to browse through? Typically a server is set up to prevent directory listing, but sometimes it is not. If not, become self-sufficient and fix it yourself:

IndexIgnore *

The * is a wildcard that matches all files, so if you stick that line into an htaccess file in your images directory, nothing in that directory will be allowed to be listed.

On the other hand, what if you did want the directory contents to be listed, but only if they were HTML pages and not images?

Simple says I:

IndexIgnore *.gif *.jpg

This would return a list of all files not ending in .jpg or .gif, but would still list .txt, .html, etc.

And conversely, if your server is set up to prevent directory listing, but you want to list the directories by default, you could simply throw this into an htaccess file in the directory you want displayed:

Options +Indexes

If you do use this option, be very careful that you do not put any unintentional or compromising files in this directory. And if you guessed it by the plus sign before Indexes, you can throw in a minus sign (Options -Indexes) to prevent directory listing entirely. This is typical of most server setups and is usually configured elsewhere in the Apache server, but can be overridden through htaccess.

If you really want to be tricky, using the +Indexes option, you can include a default description for the directory listing that is displayed when you use it by placing a file called HEADER in the same directory. The contents of this file will be printed out before the list of directory contents is listed. You can also specify a footer, though it is called README, by placing it in the same directory as the HEADER. The README file is printed out after the directory listing is printed.

Conclusion & More Information

Of course, I can't list every possible use of htaccess here, just the more notable and useful ones (read: for fun and profit). There is a list of Apache Directives you can use for your htaccess files, though not all of them are designed to be used by htaccess. Consult the documentation for the directive you are looking to use and make sure that you can actually use it as an htaccess string.

You should also go through the Apache User's Guide for more detailed information if you are really serious about making your life easier as a webmaster. You don't need to update all 4,000 of the pages on your site individually, by hand, in order to change one file reference...honestly!

In any event, I hope you got a better idea of the power available to you through this relatively simple little Clark Kent-ish file. You really do have the ability to save yourself a lot of time and grief by using htaccess, especially when you add to that the power of SSI and xSSI.

  • A nice English-language article on where htaccess can be used
  • Error Documents
  • Password protection
  • Enabling SSI Via htaccess
  • Blocking users by IP
  • Blocking users/ sites by referrer
  • Blocking bad bots and site rippers (aka offline browsers)
  • Change your default directory page
  • Redirects
  • Prevent viewing of .htaccess file
  • Adding MIME Types
  • Preventing hot linking of images and other file types
  • Serving alternate content when hot linking is detected
  • Preventing Directory Listing
  • Conclusion & More Information

.htaccess
A nice English-language article on where htaccess can be used

www.dijitalders.com   
Comprehensive guide to .htaccess

Tutorial written and contributed by Feyd, moderator of the JK Forum, with additions by JavaScriptKit.com. Please see tutorial footnote for additional/bio info on author. Last updated: Jan 18th, '06, for an additional section.

I am sure that most of you have heard of htaccess, if just vaguely, and that you may think you have a fair idea of what can be done with an htaccess file. You are more than likely mistaken about that, however. Regardless, even if you have never heard of htaccess and what it can do for you, the intention of this tutorial is to get you two moving along nicely together.

If you have heard of htaccess, chances are that it has been in relation to implementing custom error pages or password protected directories. But there is much more available to you through the marvelously simple .htaccess file.
A Few General Ideas

An htaccess file is a simple ASCII file, such as you would create through a text editor like NotePad or SimpleText. Many people seem to have some confusion over the naming convention for the file, so let me get that out of the way.

    .htaccess is the complete file name, not an extension added to some other name. It is not file.htaccess or somepage.htaccess, it is simply named .htaccess

    In order to create the file, open up a text editor and save an empty page as .htaccess (or type in one character, as some editors will not let you save an empty page). Chances are that your editor will append its default file extension to the name (ex: for Notepad it would call the file .htaccess.txt). You need to remove the .txt (or other) file extension in order to get yourself htaccessing--yes, I know that isn't a word, but it sounds keen, don't it? You can do this by right clicking on the file and renaming it by removing anything that doesn't say .htaccess. You can also rename it via telnet or your ftp program, and you should be familiar enough with one of those so as not to need explaining.



    htaccess files must be uploaded in ASCII mode, not BINARY. You may need to CHMOD the htaccess file to 644 (RW-R--R--). This makes the file usable by the server while keeping anyone but its owner from writing to it. Keeping the file from being read by a browser matters too, since a readable htaccess file can seriously compromise your security. (For example, if you have password protected directories and a browser can read the htaccess file, then a visitor can get the location of the authentication file and then reverse engineer the list to get full access to any portion that you previously had protected. There are different ways to prevent this, one being to place all your authentication files above the root directory so that they are not www accessible, and the other is through an htaccess series of commands that prevents the file itself from being accessed by a browser; more on that later.)

    Most commands in htaccess are meant to be placed on one line only, so if you use a text editor that uses word-wrap, make sure it is disabled or it might throw in a few characters that annoy Apache to no end, although Apache is typically very forgiving of malformed content in an htaccess file.

    htaccess is an Apache thing, not an NT thing. There are similar capabilities for NT servers, though in my professional experience and personal opinion, NT's ability in these areas is severely handicapped. But that's not what we're here for.



    htaccess files affect the directory they are placed in and all sub-directories; that is, an htaccess file located in your root directory (yoursite.com) would affect yoursite.com/content, yoursite.com/content/contents, etc. It is important to note that this can be changed (if, for example, you did not want certain htaccess commands to affect a specific directory) by placing a new htaccess file within the directory you don't want affected and overriding the specific command(s) there. In short, Apache applies every htaccess file on the path from your root down to the current directory, and where two of them set the same thing, the nearest one wins. If the only htaccess file is your global htaccess located in your root, then it affects every single directory in your entire site.

    Before you go off and plant htaccess everywhere, read through this and make sure you don't do anything redundant, since it is possible to cause an infinite loop of redirects or errors if you place something weird in the htaccess.

    Also...some sites do not allow use of htaccess files, since depending on what they are doing, they can slow down a server overloaded with domains if they are all using htaccess files. I can't stress this enough: You need to make sure you are allowed to use htaccess before you actually use it. Some things that htaccess can do can compromise a server configuration that has been specifically set up by the admin, so don't get in trouble.
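
To make the sub-directory behaviour described above concrete, here is a small sketch with two hypothetical files (the /downloads directory and the error page path are placeholders chosen for illustration). The settings in the root file reach every directory below it; the second file changes one setting for its own directory and leaves the rest in effect.

```apache
# File 1: yoursite.com/.htaccess (applies to the whole site)
ErrorDocument 404 /errors/notfound.html
Options -Indexes

# File 2: yoursite.com/downloads/.htaccess (applies to /downloads and below)
# Re-enables directory listings here only; the 404 page from File 1 still applies
Options +Indexes
```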

 

 

Error Documents

 

 

This seems to be what people think htaccess was meant for, but it is only part of the general use. We'll be getting into progressively more advanced stuff after this.

Successful Client Requests
200 OK
201 Created
202 Accepted
203 Non-Authorative Information
204 No Content
205 Reset Content
206 Partial Content
Client Request Redirected
300 Multiple Choices
301 Moved Permanently
302 Moved Temporarily
303 See Other
304 Not Modified
305 Use Proxy
Client Request Errors
400 Bad Request
401 Authorization Required
402 Payment Required (not used yet)
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable (encoding)
407 Proxy Authentication Required  
408 Request Timed Out
409 Conflicting Request
410 Gone
411 Content Length Required
412 Precondition Failed
413 Request Entity Too Large
414 Request URI Too Long
415 Unsupported Media Type
Server Errors
500 Internal Server Error
501 Not Implemented
502 Bad Gateway  
503 Service Unavailable  
504 Gateway Timeout  
505 HTTP Version Not Supported  

In order to specify your own ErrorDocuments, you need to be slightly familiar with the server returned error codes (see the list above). You do not need to specify error pages for all of these; in fact, you shouldn't. An ErrorDocument for code 200 would cause an infinite loop whenever a page was found...this would not be good.

You will probably want to create an error document for codes 404 and 500, at the least 404 since this would give you a chance to handle requests for pages not found. 500 would help you out with internal server errors in any scripts you have running. You may also want to consider ErrorDocuments for 401 - Authorization Required (as in when somebody tries to enter a protected area of your site without the proper credentials), 403 - Forbidden (as in when a file with permissions not allowing it to be accessed by the user is requested) and 400 - Bad Request, which is one of those generic kind of errors that people get to by doing some weird stuff with your URL or scripts.

In order to specify your own customized error documents, you simply need to add the following command, on one line, within your htaccess file:

ErrorDocument code /directory/filename.ext
or
ErrorDocument 404 /errors/notfound.html
This would cause any request resulting in a 404 error to be forwarded to yoursite.com/errors/notfound.html

Likewise with:
ErrorDocument 500 /errors/internalerror.html

You can name the pages anything you want (I'd recommend something that would prevent you from forgetting what the page is being used for), and you can place the error pages anywhere you want within your site, so long as they are web-accessible (through a URL). The initial slash in the directory location represents the root directory of your site, that being where your default page for your first-level domain is located. I typically prefer to keep them in a separate directory for maintenance purposes and in order to better control spiders indexing them through a ROBOTS.TXT file, but it is entirely up to you.

If you were to use an error document handler for each of the error codes I mentioned, the htaccess file would look like the following (note each command is on its own line):

ErrorDocument 400 /errors/badrequest.html
ErrorDocument 401 /errors/authreqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/serverr.html

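You can specify a full URL rather than a virtual URL in the ErrorDocument string (http://yoursite.com/errors/notfound.html vs. /errors/notfound.html). But this is not the preferred method: with a full URL the server sends the browser a redirect to that page instead of the error itself, so the original error code is lost along the way. A 401 document specified this way will not work at all, because the browser never sees the 401 that makes it ask for a password.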

You can also specify HTML, believe it or not!

ErrorDocument 401 "<body bgcolor=#ffffff><h1>You have to actually <b>BE</b> a <a href="#">member</a> to view this page, Colonel!

The only time I use that HTML option is if I am feeling particularly saucy, since you can have so much more control over the error pages when used in conjunction with xSSI or CGI or both. Also note that the ErrorDocument starts with a " just before the HTML starts, but does not end with one. That is the Apache 1.3 syntax; on Apache 2.0 and later the message has to be wrapped in a matching pair of quotes instead, so check which version your host runs. And again, that should all be on one line, no naughty word wrapping!
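
For servers running Apache 2.0 or later, a minimal sketch of the same idea looks like this. The message is kept as plain text here so that no quotes inside it need escaping:

```apache
# Apache 2.0 and later: the message sits inside a matching pair of quotes
ErrorDocument 401 "You have to actually BE a member to view this page, Colonel!"
```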

Next, we are moving on to password protection, that last frontier before I dunk you into the true capabilities of htaccess. If you are familiar with setting up your own password protected directories via htaccess, you may feel like skipping ahead.

Password protection

 

Ever wanted a specific directory in your site to be available only to people who you want it to be available to? Ever got frustrated with the seeming holes in client-side options for this that allowed virtually anyone with enough skill to mess around in your source to get in? htaccess is the answer!

There are numerous methods of password protecting areas of your site, some server language based (such as ASP, PHP or PERL) and some client side based, such as JavaScript. JavaScript is not as secure or foolproof as a server-side option; a server side challenge/response is always more secure than a client-dependent challenge/response. htaccess is about as secure as you can or need to get in everyday life, though there are ways above and beyond even that of htaccess. If you aren't comfortable enough with htaccess, you can password protect your pages any number of ways, and JavaScript Kit has plenty of password protection scripts for your use.

The first thing you will need to do is create a file called .htpasswd. I know, you might have problems with the naming convention, but it is the same idea behind naming the htaccess file itself, and you should be able to do that by this point. In the htpasswd file, you place the username and password (which is encrypted) for those whom you want to have access.

For example, with a username and password of wsabstract (and I do not recommend having the username being the same as the password), the htpasswd file would look like this:

wsabstract:y4E7Ep8e7EYV

Notice that it is UserName first, followed by the Password, with a colon between the two. There is a handy-dandy tool available for you to easily encrypt the password into the proper encoding for use in the htpasswd file.

For security, you should not upload the htpasswd file to a directory that is web accessible (yoursite.com/.htpasswd), it should be placed above your www root directory. You'll be specifying the location to it later on, so be sure you know where you put it. Also, this file, as with htaccess, should be uploaded as ASCII and not BINARY.

Create a new htaccess file and place the following code in it:

AuthUserFile /usr/local/you/safedir/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic

require user wsabstract

The first line is the full server path to your htpasswd file. If you have installed scripts on your server, you should be familiar with this. Please note that this is not a URL, this is a server path. Also note that if you place this htaccess file in your root directory, it will password protect your entire site, which probably isn't your exact goal.

The last line, require user, is where you enter the username of those who you want to have access to that portion of your site. Note that using this will allow only that specific user to be able to access that directory. This is useful if you had an htpasswd file that had multiple users set up in it and you wanted each one to have access to an individual directory. If you wanted the entire list of users to have access to that directory, you would replace require user xxx with require valid-user.

The AuthName is the name of the area you want to access. It could be anything, such as "EnterPassword". You can change the name of this 'realm' to whatever you want, within reason.

We are using AuthType Basic because we are using basic HTTP authentication.
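
As a sketch of the valid-user variant described above, the whole file might look like the following. The realm name "Members Area" is an invented example, and the htpasswd path is the same placeholder used earlier:

```apache
AuthUserFile /usr/local/you/safedir/.htpasswd
AuthGroupFile /dev/null
# Quotes are needed once the realm name contains a space
AuthName "Members Area"
AuthType Basic

# Let in anyone listed in the htpasswd file, not just one named user
require valid-user
```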

Enabling SSI Via htaccess

 

Many people want to use SSI, but don't seem to have the ability to do so with their current web host. You can change that with htaccess. A note of caution first...definitely ask permission from your host before you do this, it can be considered 'hacking' or violation of your host's TOS, so be safe rather than sorry:

AddType text/html .shtml
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes

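The first line tells the server that pages with a .shtml extension (for Server parsed HTML) are valid. The second line adds a handler, the actual SSI bit, for all files ending in .shtml. This tells the server that any such file should be parsed for server side commands. The last line sets the options for the directory: Includes is the one that actually permits SSI, while Indexes allows directory listings and FollowSymLinks lets the server follow symbolic links. Be aware that an Options line written without plus or minus signs replaces whatever options the directory inherited, so leave out Indexes if you don't want directory listings.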

And that's it, you should have SSI enabled. But wait...don't feel like renaming all of your pages to .shtml in order to take advantage of this neat little toy? Me either! Just add this line to the fragment above, between the first and second lines:

AddHandler server-parsed .html

A note of caution on that one too, however. This will force the server to parse every page named .html for SSI commands, even if they have no SSI commands within them. If you are using SSI sparingly on your site, this is going to give you more server drain than you can justify. SSI does slow down a server because it does extra stuff before serving up a page, although in human terms of speed, it is virtually transparent. Some people also prefer to allow SSI in html pages so that nobody can tell from the page extension that they are using SSI, which helps avoid drawing SSI-based attacks on the server. Either way, you now have the knowledge to do it.

If, however, you are going to keep SSI pages with the extension of .shtml, and you want to use SSI on your Index pages, you need to add the following line to your htaccess:

DirectoryIndex index.shtml index.html

This allows a page named index.shtml to be your default page, and if that is not found, index.html is loaded. More on DirectoryIndex later.
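
Putting the pieces of this section together, one possible combined fragment is sketched below. It uses Options +Includes rather than the full Options line shown earlier, on the assumption that you only want to switch SSI on and leave the other inherited options alone:

```apache
AddType text/html .shtml
AddHandler server-parsed .shtml
# Optional: also parse plain .html files for SSI commands
AddHandler server-parsed .html
# +Includes turns SSI on without resetting the other inherited Options
Options +Includes
# Try index.shtml first, then fall back to index.html
DirectoryIndex index.shtml index.html
```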

Blocking users by IP

 

Is there a pesky person perpetrating pain upon you? Stalking your site from the vastness of the electron void? Blockem! In your htaccess file, add the following code--changing the IPs to suit your needs--each command on one line each:

order allow,deny
deny from 123.45.6.7
deny from 012.34.5.
allow from all

You can deny access based upon IP address or an IP block. The above blocks access to the site from 123.45.6.7, and from any address within the IP block 012.34.5. (012.34.5.1, 012.34.5.2, 012.34.5.3, etc.) I have yet to find a useful application of this; maybe if there is a site scraping your content you can block them, who knows.

You can also set an option for deny from all, which would of course deny everyone. You can also allow or deny by domain name rather than IP address (allow from .javascriptkit.com works for www.javascriptkit.com or virtual.javascriptkit.com, etc.)
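
As a sketch of the reverse case, this shuts everyone out and then lets a single domain back in. Note that the order line is flipped compared with the blocking example, so that the allow gets the final say:

```apache
# Deny rules are checked first, then allow rules can override them
order deny,allow
deny from all
allow from .javascriptkit.com
```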

Blocking users/ sites by referrer

Note: This portion of tutorial written by JavaScript Kit

 

Blocking users or sites that originate from a particular domain is another useful trick of .htaccess. Lets say you check your logs one day, and see tons of referrals from a particular site, yet upon inspection you can't find a single visible link to your site on theirs. The referral isn't a "legitimate" one, with the site most likely hot linking to certain files on your site such as images, .css files, or files you can't even make out. Remember, your logs will generate a referrer entry for any kind of reference to your site that has a traceable origin.

Before I get to the code itself, it's important to note that blocking access by referrer in .htaccess requires the help of the Apache module mod_rewrite to make out the referrer first. This module is installed by default on most servers (ask your host if you're not sure). So, to deny access to all traffic that originates from a particular domain (referrer), use the following code:

Block traffic from a single referrer:

RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} badsite\.com [NC]
RewriteRule .* - [F]

Block traffic from multiple referrers

RewriteEngine on
# Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} badsite\.com [NC,OR]
RewriteCond %{HTTP_REFERER} anotherbadsite\.com [NC]
RewriteRule .* - [F]

In the "single referrer" case above, "badsite.com" is the domain you wish to block. Note the backslash preceding the period (".") to actually denote a period; in Regular Expressions, a bare period denotes any character, which is not what we want. The flag "[NC]" is added to the end of the domain to make it case insensitive, so whether the domain is "badsite.com", "Badsite.com" etc, however bad it gets, it gets blocked. Finally, the last line in the .htaccess file specifies that the action to take when a match is found is to fail the request, meaning the referrer traffic will hit a 403 Forbidden error. The only difference between blocking a single referrer and multiple referrers is the modified [NC,OR] flag (no space after the comma) added in the latter case to every domain but the last.

Now, you may have noticed the line "Options +FollowSymlinks" above, which is commented out. Uncomment this line if your server isn't configured with FollowSymLinks in its <Directory> section in httpd.conf, and you get a 500 Internal Server error when using the code above as is.
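
When the list of referrers is short, the same thing can probably be written as a single condition using regular expression alternation. This is only a sketch, with the same placeholder domains as above:

```apache
RewriteEngine on
# One condition, several referrers: the | means "or" inside the group
RewriteCond %{HTTP_REFERER} (badsite\.com|anotherbadsite\.com) [NC]
RewriteRule .* - [F]
```

Since there is only one condition, no [OR] flag is needed.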

Blocking bad bots and site rippers (aka offline browsers)

Note: This portion of tutorial written by JavaScript Kit

 

The definition of a "bad bot" varies depending on who you ask, but most would agree they are the spiders that do a lot more harm than good on your site (ie: an email harvester). A site ripper, on the other hand, is an offline browsing program that a surfer may unleash on your site to crawl and download every one of its pages for offline viewing. In both cases, your site's bandwidth and resource usage are jacked up as a result, sometimes to the point of crashing your server. Bad bots typically ignore the wishes of your robots.txt file, so you'll want to ban them using means such as .htaccess. The trick is to identify a bad bot.

Below is a useful code block you can insert into your .htaccess file for blocking a lot of the known bad bots and site rippers currently out there. It is derived from my reading of the excellent discussion "A close to perfect .htaccess file", specifically, "A close to perfect .htaccess file II." Simply add the below code to your .htaccess file:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]

Bots that are listed above will all receive a 403 Forbidden error when trying to view your site. The amount of bandwidth savings and decrease in server resource usage as a result may be significant in many cases.
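
If you want to add bots of your own to the list, the pattern to follow is sketched below. Both user agent names are invented for illustration; substitute whatever shows up in your logs:

```apache
RewriteEngine On
# ^ anchors the match to the start of the user agent string
RewriteCond %{HTTP_USER_AGENT} ^ExampleRipper [OR]
# A space inside the name has to be escaped with a backslash
# The last condition carries no [OR] flag
RewriteCond %{HTTP_USER_AGENT} ^Some\ Grabber
RewriteRule ^.* - [F,L]
```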

Change your default directory page

 

Some of you may be wondering, just what in the world is a DirectoryIndex? Well, grasshopper, this is a command which allows you to specify a file that is to be loaded as your default page whenever a directory or url request comes in, that does not specify a specific page. Tired of having yoursite.com/index.html come up when you go to yoursite.com? Want to change it to be yoursite.com/ILikePizzaSteve.html that comes up instead? No problem!

DirectoryIndex filename.html

This would cause filename.html to be treated as your default page, or default directory page. You can also append other filenames to it. You may want to have certain directories use a script as a default page. That's no problem too!

DirectoryIndex filename.html index.cgi index.pl default.htm

Placing the above command in your htaccess file will cause this to happen: When a user types in yoursite.com, your site will look for filename.html in your root directory (or any directory if you specify this in the global htaccess), and if it finds it, it will load that page as the default page. If it does not find filename.html, it will then look for index.cgi; if it finds that one, it will load it, if not, it will look for index.pl and the whole process repeats until it finds a file it can use. Basically, the list of files is read from left to right.

Every once in a while, I use this method for the following needs: Say I keep all my include files in a directory called include, and all my image files in a directory called images, and I don't want people to be able to directory browse through them (even though we can prevent that through another htaccess trick, more on that later). I would specify a DirectoryIndex entry, in a specific htaccess file for those two directories, pointing to /redirect/index.pl, which is a redirect page that sends any request for those directories to the homepage. Or I could just specify a directory index of index.pl and upload an index.pl file to each of those directories. Or I could just stick in an htaccess redirect, which is our next subject!
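
A minimal sketch of the first option, assuming the redirect script really does live at /redirect/index.pl on your site:

```apache
# Goes in /images/.htaccess (and again in /include/.htaccess)
# Any request for the bare directory is answered by the redirect script
DirectoryIndex /redirect/index.pl
```

The leading slash makes the path relative to the root of the site rather than to the directory being requested.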

Redirects

 

Ever go through the nightmare of changing significant portions of your site, then having to deal with the problem of people finding their way from the old pages to the new? It can be nasty. There are different ways of redirecting pages, through http-equiv, javascript or any of the server-side languages. And then you can do it through htaccess, which is probably the most effective, considering the minimal amount of work required to do it.

htaccess uses redirect to look for any request for a specific page (or a non-specific location, though this can cause infinite loops) and if it finds that request, it forwards it to a new page you have specified:

Redirect /olddirectory/oldfile.html http://yoursite.com/newdirectory/newfile.html

Note that there are 3 parts to that, which should all be on one line: the Redirect command, the location of the file/directory you want redirected relative to the root of your site (/olddirectory/oldfile.html = yoursite.com/olddirectory/oldfile.html) and the full URL of the location you want that request sent to. Each of the 3 is separated by a single space, but all on one line. You can also redirect an entire directory by simply using Redirect /olddirectory http://yoursite.com/newdirectory/

Using this method, you can redirect any number of pages no matter what you do to your directory structure. It is the fastest method that has a global effect.
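
Redirect also accepts an optional status between the command and the old location. Here is a short sketch using the same placeholder paths as above:

```apache
# 301 tells browsers and search engines that the move is permanent
Redirect 301 /olddirectory/oldfile.html http://yoursite.com/newdirectory/newfile.html
# With no status given, Redirect sends a temporary (302) redirect
Redirect /olddirectory http://yoursite.com/newdirectory/
```

The more specific rule is listed first so that it is the one that matches requests for oldfile.html.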

Prevent viewing of .htaccess file

 

If you use htaccess for password protection, then the location containing all of your password information is plainly available through the htaccess file. If you have set incorrect permissions or if your server is not as secure as it could be, a browser has the potential to view an htaccess file through a standard web interface and thus compromise your site/server. This, of course, would be a bad thing. However, it is possible to prevent an htaccess file from being viewed in this manner:

<Files .htaccess>
order allow,deny
deny from all
</Files>

The first line specifies that the file named .htaccess is having this rule applied to it. You could use this for other purposes as well if you get creative enough.

If you use this in your htaccess file, a person trying to see that file would get returned (under most server configurations) a 403 error code. You can also set permissions for your htaccess file via CHMOD, which would also prevent this from happening, as an added measure of security: 644 or RW-R--R--
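
One way to get creative with this, sketched here under the assumption that your host allows FilesMatch in htaccess files, is to cover every file whose name starts with .ht in one go:

```apache
# Matches .htaccess, .htpasswd and anything else starting with ".ht"
<FilesMatch "^\.ht">
order allow,deny
deny from all
</FilesMatch>
```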

Adding MIME Types

 

What if your server wasn't set up to deliver certain file types properly? A common occurrence with MP3 or even SWF files. Simple enough to fix:

AddType application/x-shockwave-flash swf

AddType is specifying that you are adding a MIME type. The application string is the actual parameter of the MIME you are adding, and the final little bit is the default extension for the MIME type you just added, in our example this is swf for ShockWave File.

By the way, here's a neat little trick that few know about, but you get to be part of the club since JavaScript Kit loves you: To force a file to be downloaded, via the Save As browser feature, you can simply set a MIME type to application/octet-stream and that immediately prompts you for the download. I have no idea how that would be useful, but that question has come up in our Forums from time to time, so there ya' go.
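
As a rough example of that trick, the line below marks two file types as downloads. The extensions are picked purely for illustration, and exactly how a browser reacts can vary:

```apache
# Hypothetical choice of extensions: offer .csv and .mp3 files as downloads
AddType application/octet-stream .csv .mp3
```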

 

Preventing hot linking of images and other file types

Note: This portion of tutorial written by JavaScript Kit

 

In the webmaster community, "hot linking" is a curse phrase. Also known as "bandwidth stealing" by the angry site owner, it refers to linking directly to non-html objects not on one's own server, such as images, .js files etc. The victim's server in this case is robbed of bandwidth (and in turn money) as the violator enjoys showing content without having to pay for its delivery. The most common practice of hot linking pertains to another site's images.

Using .htaccess, you can disallow hot linking on your server, so those attempting to link to an image or CSS file on your site, for example, are either blocked (failed request, such as a broken image) or served different content (ie: an image of an angry man). Note that mod_rewrite needs to be enabled on your server in order for this aspect of .htaccess to work. Ask your web host about this.

With all the pieces in place, here's how to disable hot linking of certain file types on your site, in the case below, images, JavaScript (js) and CSS (css) files on your site. Simply add the below code to your .htaccess file, and upload the file either to your root directory, or a particular subdirectory to localize the effect to just one section of your site:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|js|css)$ - [F]

Be sure to replace "mydomain.com" with your own. The above code creates a failed request when hot linking of the specified file types occurs. In the case of images, a broken image is shown instead.

Serving alternate content when hot linking is detected

You can set up your .htaccess file to actually serve up different content when hot linking occurs. This is more commonly done with images, such as serving up an Angry Man image in place of the hot linked one. The code for this is:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
RewriteRule \.(gif|jpg)$ http://www.mydomain.com/angryman.gif [R,L]

Same deal- replace mydomain.com with your own, plus angryman.gif.
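
One thing to watch: the replacement image ends in .gif too, so a request for it coming from the hot linking page can match the rule again and redirect in circles. A possible guard, sketched here with the same placeholder names, is an extra condition that leaves the replacement image alone:

```apache
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/.*$ [NC]
# Skip the replacement image itself so the redirect cannot match again
RewriteCond %{REQUEST_URI} !angryman\.gif$
RewriteRule \.(gif|jpg)$ http://www.mydomain.com/angryman.gif [R,L]
```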

Time to pour a bucket of cold water on hot linking!

Preventing Directory Listing

 

Do you have a directory full of images or zips that you do not want people to be able to browse through? Typically a server is set up to prevent directory listing, but sometimes it is not. If not, become self-sufficient and fix it yourself:

IndexIgnore *

The * is a wildcard that matches all files, so if you stick that line into an htaccess file in your images directory, nothing in that directory will be allowed to be listed.

On the other hand, what if you did want the directory contents to be listed, but only if they were HTML pages and not images? Simple says I:

IndexIgnore *.gif *.jpg

This would return a list of all files not ending in .jpg or .gif, but would still list .txt, .html, etc.

And conversely, if your server is set up to prevent directory listing, but you want to list the directories by default, you could simply throw this into an htaccess file in the directory you want displayed:

Options +Indexes

If you do use this option, be very careful that you do not put any unintentional or compromising files in this directory. And if you guessed it by the plus sign before Indexes, you can throw in a minus sign (Options -Indexes) to prevent directory listing entirely--this is typical of most server setups and is usually configured elsewhere in the apache server, but can be overridden through htaccess.

If you really want to be tricky, using the +Indexes option, you can include a default description for the directory listing that is displayed when you use it by placing a file called HEADER in the same directory. The contents of this file will be printed out before the list of directory contents is listed. You can also specify a footer, though it is called README, by placing it in the same directory as the HEADER. The README file is printed out after the directory listing is printed.
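
If you would rather pick the header and footer file names yourself, the directory listing module has directives for that. A sketch, assuming the module (mod_autoindex) is loaded on your server and using made-up file names:

```apache
Options +Indexes
# Keep images out of the listing
IndexIgnore *.gif *.jpg
# File names below are assumptions; use whatever suits your site
HeaderName HEADER.html
ReadmeName README.html
```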

Conclusion & More Information

 

Of course, I can't list every possible use of htaccess here, just the more notable and useful ones (read: for fun and profit). There is a list of Apache Directives you can use for your htaccess files, though not all of them are designed to be used by htaccess. Consult the documentation for the directive you are looking to use and make sure that you can actually use it as an htaccess string.

You should also go through the Apache User's Guide for more detailed information if you are really serious about making your life easier as a webmaster. You don't need to update all 4,000 of the pages on your site individually, by hand, in order to change one file reference...honestly!

In any event, I hope you got a better idea of the power available to you through this relatively simple little Clark Kent-ish file. You really do have the ability to save yourself a lot of time and grief by using htaccess, especially when you add to that the power of SSI and xSSI.