Imitation of files and directories
The address of your site appears on the user’s screen together with its design and content, so the address is a full-fledged part of the site. Nobody doubts that an address like www.firm.com (www.firm.city.com) is much better than something like www.geocities.com/Gonduras/San-Pedrillio/~our_firm. On the question of clear addresses inside a site, however, the community has not yet reached an obvious agreement.
Still, it is much more pleasant for a user to see an address like /services/special/ than one like /content.phtml?q=e23908a234cc239b3445127.
Lyrical digression. I remember being shown a flash reel of the Hewlett Packard LaserJet 3100 at Internity ’99. A couple of weeks later I remembered it and decided to download it from home. Had the addresses been different, I could have wandered around the Lexmark site for a long time in a vain search. On HP’s site the addresses were clear, something like "/products/printers/laserjet/3100", while on Lexmark’s site there was something like "q=492898748273". I hesitated, but then remembered that it was HP.
By the way, all the addresses of issues, print versions and information pages (links, files and so on) are virtual: there are no files with such names.
This is simple enough to do. Lines like the following are written in the .htaccess file:
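# The leading slash makes Apache treat the value as a local path (here all.php is assumed to sit in the document root)
ErrorDocument 404 /all.php
ErrorDocument 403 /all.php
ErrorDocument 401 /all.php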
The file all.php examines the variable $REQUEST_URI (available as $_SERVER['REQUEST_URI'] in current PHP versions) and, if the requested information is found, sends the header
header ("HTTP/1.0 200 Ok");
This is needed so that IE 4 considers the page found and does not replace it with its own ‘address not found’ page. In all other cases, even if the address “all.php” itself is requested, the user is shown a message that the file was not found.
If the server responds with "Error 500" when the header function is called, look here.
Of course, sending a header and rendering a page is simple, and checking query results and handling errors is routine. The most important part is analysing the requested address.
There are plenty of ways to do this. For example, my print version and comments pages are matched with regular expressions:
if (preg_match('/(\d+)-comment/A', $url, $res)) ...
Then I take the issue number from $res[1] and check that it exists in the database. The parts of an address with several subdirectories can be obtained, for example, with explode():
$dir = explode("/", $url);
and then we process the substrings one by one (the slashes must be removed from the beginning and the end of the string before calling explode()).
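The trimming and splitting described above can be sketched as one small helper. This is only an illustration: the function name split_address() is hypothetical and does not come from the article, and dropping the query string is an extra assumption about what the address may contain.

```php
<?php
// Hypothetical helper (not from the article): turn a requested address into path segments.
function split_address(string $url): array
{
    // Assumption: a query string, if present, is not part of the path.
    $pos = strpos($url, '?');
    if ($pos !== false) {
        $url = substr($url, 0, $pos);
    }
    // Remove slashes from both ends so that explode() yields no empty segments there.
    $url = trim($url, '/');
    // An empty path means the site root: no segments at all.
    return $url === '' ? [] : explode('/', $url);
}
```

With this helper, split_address('/services/special/') gives the two segments 'services' and 'special', which can then be checked one by one.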
And this is how I would handle addresses:
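if (preg_match('#([a-z]+)/(\d{4})/(\d{2})/(\d{2})/([a-z]+)#A', $url, $match)) {
    // The pattern captures only lowercase letters and digits, so the parts can be placed into the query as they are.
    $request = "SELECT news_id FROM news, rub WHERE news.rub_id=rub.rub_id AND rub_address='" . $match[1] .
        "' AND news_date LIKE '" . $match[2] . "-" . $match[3] . "-" . $match[4] . "' AND news_address='" . $match[5] . "'";
    // ... send the query and check the result
}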
This query only checks whether such a news item exists in the database. Depending on the result, either the news page is displayed or some warning (or the main page of the rubric or of the site).
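The address check itself can be separated from the database query. The following is a minimal sketch under stated assumptions: the function name parse_news_address() and the keys of the returned array are hypothetical, and the extra checkdate() call is an addition that rejects impossible dates before any query is built.

```php
<?php
// Hypothetical helper: split <rubric>/<year>/<month>/<day>/<news> into its parts.
// Returns null when the address does not follow the scheme.
function parse_news_address(string $url): ?array
{
    $path = trim($url, '/');
    if (!preg_match('#^([a-z]+)/(\d{4})/(\d{2})/(\d{2})/([a-z]+)$#', $path, $m)) {
        return null;
    }
    // checkdate() takes month, day, year; it rejects dates such as 2001/13/45.
    if (!checkdate((int) $m[3], (int) $m[4], (int) $m[2])) {
        return null;
    }
    return [
        'rubric' => $m[1],
        'date'   => $m[2] . '-' . $m[3] . '-' . $m[4],
        'news'   => $m[5],
    ];
}
```

The 'date' value has the same YYYY-MM-DD form that the query above compares with news_date.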
And this is how I check addresses on this site:
if (preg_match('/(\d+)-print/A', $url, $res)) {
    // print version
}
elseif (preg_match('/(\d+)-comment/A', $url, $res)) {
    // all comments
}
elseif (!preg_match('/\D/', $url)) {
    // full issue version
}
else {
    // either one of the other rubrics or address not found
}
By the way, I send header("HTTP/1.0 200 Ok") only when the issue or rubric is found.
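The rule "send 200 only when something is found" can be sketched as follows. Everything here is illustrative: find_page(), status_line() and the $pages array are hypothetical names, and the array stands in for the database lookup used on the real site.

```php
<?php
// Hypothetical stand-in for a database lookup: addresses are keys, page bodies are values.
function find_page(string $requestUri, array $pages): ?string
{
    // Compare addresses without leading and trailing slashes.
    $key = trim($requestUri, '/');
    return $pages[$key] ?? null;
}

// Choose the status line: 200 only when a page was found, otherwise keep 404.
function status_line(?string $page): string
{
    return $page === null ? 'HTTP/1.0 404 Not Found' : 'HTTP/1.0 200 OK';
}

// Possible use inside the ErrorDocument handler:
// $page = find_page($_SERVER['REQUEST_URI'], $pages);
// header(status_line($page));
// echo $page ?? 'Page not found';
```

Keeping the decision in a function that returns the status line makes it possible to check the logic without sending any headers.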
Another example. Suppose the rubrics of a site form a tree and the table in the database looks like this:
CREATE TABLE rubric (
id TINYINT NOT NULL AUTO_INCREMENT,
parent_id TINYINT,
address VARCHAR(16) NOT NULL,
title VARCHAR(128) NOT NULL,
rub_text TEXT NOT NULL,
PRIMARY KEY (id),
UNIQUE address (address)
);
There’s no need to explain what title and rub_text are. The field address is the address under which the rubric is requested (news, for example, would be made a first-level rubric with ‘news’ in its address field). The field parent_id is the identifier of the rubric one level higher.
Now, if rubrics are known never to go deeper than the second level, analysing the address is not difficult.
$url = $_SERVER['REQUEST_URI'];
// remove slashes from the beginning and the end of the address
$url = trim($url, "/");
$dir = explode("/", $url);
// the case when a second-level rubric is requested
if (sizeof($dir) == 2) {
    // The query joins the rubric table with itself.
    // Note that the alias 'first' stands for the second-level rubric
    // and 'second', conversely, for the first-level one.
    // $db is assumed to be an open mysqli connection; values are escaped before use.
    $request = "SELECT first.id, first.title, first.rub_text FROM
        rubric first, rubric second WHERE
        first.parent_id=second.id AND first.address='" . mysqli_real_escape_string($db, $dir[1]) . "' AND
        second.address='" . mysqli_real_escape_string($db, $dir[0]) . "'";
    // We send the query to the database and then handle the result.
    $result = mysqli_query($db, $request);
    if ($result && mysqli_num_rows($result) == 1) {
        // the rubric is found
    }
    // This is the case when the query succeeded but nothing was found
    elseif ($result) {
        // address not found
    }
    // ...and the case when an error happened.
    else
        die("Database error. MySQL says: " . mysqli_error($db));
}
// A first-level rubric is requested. There is nothing special to do here.
elseif (strpos($url, "/") === false) {
    $request = "SELECT id, title, rub_text FROM rubric WHERE address='" . mysqli_real_escape_string($db, $url) . "'";
    ...
}
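The listing above handles at most two levels. For illustration only, the same parent_id idea can be followed to any depth by walking the segments one at a time. In this sketch the rows are held in a plain array instead of the database, and resolve_rubric() is a hypothetical name; it is not how the article's site works.

```php
<?php
// Hypothetical helper: find the rubric addressed by a list of path segments.
// $rubrics is a list of rows with the keys id, parent_id and address, as in the rubric table.
function resolve_rubric(array $segments, array $rubrics): ?array
{
    $parentId = null;   // first-level rubrics are assumed to have parent_id NULL
    $found = null;
    foreach ($segments as $segment) {
        $found = null;
        foreach ($rubrics as $row) {
            if ($row['address'] === $segment && $row['parent_id'] === $parentId) {
                $found = $row;
                break;
            }
        }
        if ($found === null) {
            return null;            // one segment did not match: address not found
        }
        $parentId = $found['id'];   // descend one level
    }
    return $found;                  // null for an empty list of segments
}
```

An address such as /services/special is accepted only if 'special' really is a child of 'services'. A request for /special alone returns null.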
Enough examples? Let’s sum up and describe the advantages and drawbacks in detail.
Advantages
- Nice addresses; a visitor can reach a rubric by typing its address on the keyboard.
- The gratitude of keyboard fans.
- Fewer files and fewer operations repeated across different files.
- Centralized output: most operations are gathered in one entry point.
- Some of the site’s technology is hidden.
Drawbacks
- Higher resource consumption, because of address checking and because one big file is compiled instead of several small ones.
- Difficulties with introducing new parameters. You’ll have to put something into cookies, for example.
- Some things, search for example, will still remain outside the ‘entry point’ (although "How IT works" puts the search query into the address, for a more or less complex site this will be inconvenient or impossible).
- Additional difficulties with image addresses and site navigation (since the browser resolves all relative addresses against the address of the open document, even a non-existent one).
Imitation of files and directories. Part 2
The only thing I forgot to describe in the previous issue is virtual file archives. Through ErrorDocument it is quite possible to catch requests to a non-existent directory ‘download’ and serve the requested files from the database, while the site administrator works with these archives through a web form. To send the content type correctly, you need to record it at the moment the administrator uploads the file. The table with the content needs an additional field in which these types are stored, so that they can then be sent in the header. Of course, server performance decreases.
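As a rough sketch of the last step, the headers for a stored file could be assembled like this. The row layout (the keys content_type and body) and the function name download_headers() are assumptions made for the example. Falling back to application/octet-stream when no type was stored is also an assumption.

```php
<?php
// Hypothetical helper: build the response headers for a file row fetched from the database.
function download_headers(array $file): array
{
    // Use the type recorded at upload time; fall back to a generic binary type.
    $type = $file['content_type'] ?? '';
    if ($type === '') {
        $type = 'application/octet-stream';
    }
    return [
        'HTTP/1.0 200 OK',
        'Content-Type: ' . $type,
        'Content-Length: ' . strlen($file['body']),
    ];
}

// Possible use inside the ErrorDocument handler:
// foreach (download_headers($file) as $line) { header($line); }
// echo $file['body'];
```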
Server searches for a file with the same name
It turns out to be enough to add the line Options MultiViews to the directory settings (httpd.conf or .htaccess) or, if an Options directive already exists for the directory, to add MultiViews to it. Then if a user enters "
/* The first variant: an address like "/news/010120" is requested, with an optional slash at the end.
   The symbols ^ and $ anchor the pattern to the beginning and the end of the string.
   The substring [0-9]{6} means 6 digits (if your news may be dated 1999 or earlier,
   use addresses with the full year and 8 digits instead of 6). */
if (preg_match('#^/news/([0-9]{6})/?$#', $_SERVER['REQUEST_URI'], $match)) {
}
/* the second variant: the address is simply "/news" or "/news/" */
elseif (preg_match('#^/news/?$#', $_SERVER['REQUEST_URI'])) {
}
/* requests to all other addresses (within this file) are treated as break-in attempts */
else
    die("Error 404 Not found");
The same can be done, for example, with a company’s product catalogue: you can move all of it into a single file catalogue.php and make addresses like "/catalogue/rubric1/rubric2/rubric3". In that case catalogue.php strips the beginning of the string, and from there you can work on the principle described in the previous issue. Everything else can be sent to ErrorDocument.
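Stripping the beginning of the string might look like the sketch below. The function name catalogue_segments() is hypothetical, and the prefix "/catalogue" is taken from the example address above.

```php
<?php
// Hypothetical helper: return the rubric segments that follow the catalogue prefix,
// or null when the address does not belong to the catalogue at all.
function catalogue_segments(string $url, string $prefix = '/catalogue'): ?array
{
    // Accept the bare prefix or the prefix followed by a slash, but not "/cataloguex".
    if ($url !== $prefix && strpos($url, $prefix . '/') !== 0) {
        return null;
    }
    // Cut the prefix off and split the rest into rubric addresses.
    $rest = trim((string) substr($url, strlen($prefix)), '/');
    return $rest === '' ? [] : explode('/', $rest);
}
```

For "/catalogue/rubric1/rubric2/rubric3" this gives the three rubric addresses in order, which can then be checked against the rubric tree.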
Server analyses request
This method is similar to the previous one, but at first sight it seems to consume fewer resources, since there is no need to search for files in the directory.
In the directory settings (httpd.conf or .htaccess) we write:
<FilesMatch "^(news)$">
ForceType application/x-httpd-php
</FilesMatch>
A file named "news" (exactly "news", without any extension) is placed in the directory. When the address "/news" or "/news/bla-bla-bla" is requested, the server executes the file news as a PHP script, and $REQUEST_URI is handled inside it.
To avoid writing a separate FilesMatch for each such file, you need a small change in the pattern. Make the server match files without an extension, i.e. those with no dot in their names:
<FilesMatch "^([^\.]+)$">
Rather convenient! Some day I’ll install something like this myself.
Server rewrites requests
mod_rewrite is a rather useful thing. With it you can do everything described above and much more.
Unfortunately I lost my fight with mod_rewrite, so I describe only the obvious things.
First of all you need to uncomment the line
LoadModule rewrite_module <path to module/name of a file>
in httpd.conf. In the directory configuration we write the line "RewriteEngine On". Then comes a RewriteRule command: RewriteRule <template> <substitution>
For example, RewriteRule ^(.*)\.html$ /otherdir/$1.html (everything without inverted commas). That’s all.
And one more example:
RewriteEngine On
RewriteRule ^(.*)\.htm$ /portal/$1
<FilesMatch "(portal)$">
ForceType application/x-httpd-php
</FilesMatch>
Here a single file named ‘portal’ is placed in the directory, and all requests for .htm files in that directory are redirected to it. To the visitor it looks as if all those files were really there.
And finally I’d like to look at implementing dynamic addressing through PHP. The address scheme is the following: <rubric>/<year>/<month>/<day>/<news>. If we cut off the day, month and year, we get the latest material of the rubric.
RewriteRule ^([a-z]+)/$ rubric_last/$1
RewriteRule ^([a-z]+)/([0-9]{4})/([0-9]{2})/([0-9]{2})/$ rubric_date/$1/$2-$3-$4
RewriteRule ^([a-z]+)/([0-9]{4})/([0-9]{2})/([0-9]{2})/([a-z]+)/$ rubric_news/$1/$2-$3-$4/$5
<FilesMatch "^rubric">
ForceType application/x-httpd-php
</FilesMatch>
The advantages of this method over a common entry point through ErrorDocument are obvious: the PHP interpreter processes only the code that is going to be executed. There is no switch/case or if/elseif/else and no superfluous lines.
Everything else we pass to ErrorDocument “as in the previous task”.
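For illustration, the script rubric_date could read its arguments as sketched below. This assumes that Apache hands the part of the address after the script name to PHP as $_SERVER['PATH_INFO'] (for example "/politics/2001-03-15"). The function name parse_rubric_date() and the returned keys are hypothetical.

```php
<?php
// Hypothetical helper for a script such as rubric_date: split "/<rubric>/<yyyy-mm-dd>".
function parse_rubric_date(string $pathInfo): ?array
{
    if (!preg_match('#^/([a-z]+)/(\d{4}-\d{2}-\d{2})$#', $pathInfo, $m)) {
        return null;   // the rewrite rules should never produce this, but check anyway
    }
    return ['rubric' => $m[1], 'date' => $m[2]];
}

// Possible use at the top of rubric_date:
// $args = parse_rubric_date($_SERVER['PATH_INFO'] ?? '');
// if ($args === null) { die("Error 404 Not found"); }
```

Because the rewrite rules have already chosen the script, each script needs only one such check instead of a chain of if/elseif branches.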



