Short URLs are all the rage. Everybody knows how they work — they might have some favourites too — so I won’t get into the whole “if you’ve been living under a rock” routine. If you have been living under a rock, Wikipedia explains the concept pretty nicely. It’s not such a big deal, but Twitter makes it so. And since we all Twitter, it becomes a very important thing.⌘
The recent drama involving Tr.im and their little tantrum brought out a lot of people wildly swinging their arms at the whole concept of URL shortening and the new Web’s dependence on these link hijacking centralised services. They are right in a way. Whenever a shortening service (regardless of popularity, as long as it is being used) goes down, it takes a part of the web down with it. Suddenly a lot of links are useless. It’s impossible to find all these dead links and bring them back to life in one way or another. Panayotis Vryonis had a nice analogy for it:⌘
Emphasis mine — Ed.⌘
URL shorteners are just lossless information compressors. Much like a zip function compresses information … No information is lost, but we need more resources to decompress and use it than in the original state.
There is one big difference between a function like zip and a URL shortener: the first one is based on an algorithm, all you need to extract the original information from a zipped file is knowing the zip algorithm. On the other hand, URL shorteners are using dictionaries: each URL shortening service is a dictionary that translates the short URL back to the original. If we don’t have access to this dictionary, the compressed information is useless.
Usually, I don’t link to anything rare, something that wouldn’t have a chance of getting linked to otherwise. But links back to me (anything related to my website) should remain active as long as the website remains active. I believe that URLs are a site’s identity, and there can only be one permalink for one thing. There might be other URLs, which point to that permalink, but everything will ultimately reach the same place. I wanted to make an individual short URL service for things related to me, which was under my control from the ground up so that if tomorrow services like Tr.im or Bit.ly go down, at least all short URLs related to me will still work. It would particularly suck to have links pointing to my site all die.⌘
If you’re going to use short URLs, it’ll be a good idea to save the Internet with rev=”canonical”⌘
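As a rough sketch of what that suggestion looks like in practice, a page can advertise its own short URL in its head, so that clients which understand the hint don't need to mint another one. The helper name rev_canonical_tag and the example URL are hypothetical and not part of Shorty.

```php
<?php
// Hypothetical helper (not part of Shorty): builds a <link rev="canonical">
// tag that advertises a page's preferred short URL.
function rev_canonical_tag(string $short_url): string
{
    // Escape the URL so it is safe inside an HTML attribute.
    $href = htmlspecialchars($short_url, ENT_QUOTES, 'UTF-8');
    return '<link rev="canonical" href="' . $href . '">';
}

// SHORT_DOMAIN is a placeholder, as elsewhere in this post.
echo rev_canonical_tag('http://SHORT_DOMAIN/1a'), "\n";
```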
So I set about making one. It took me about 6 hours, including deciding on and registering a new domain name. Since it was an interesting exercise, and I learnt quite a bit about how URLs really work, I thought I’d share my scripts and experiences here so that someone else can benefit from them. While I’m not saying that we all should have our own custom URL shortening services, I am saying that vital things related to you should remain in reliable hands (read: yours). No-one is more reliable than yourself.⌘
Let’s get shorty
There are many ways of making this work, and a simple search will reveal readymade scripts that you just need to put on your server and you’re good to go. They handle attempts to abuse your system as well, so they’re definitely more secure than my solution. But since this is intended only for my use, I won’t be telling you where you’ll find my shortener (and I hope you won’t guess it either). I will share my scripts though, and point out some interesting bits that I came across.⌘
My solution is written entirely in PHP since it was the fastest way to get this done. Using some simple .htaccess rewrite rules, I handle the redirects. You can grab all the important files from the Shorty Repository[1] on GitHub. Everything is under a modified MIT license. With a few changes, you’ll be ready to go. But you should keep reading to understand why I did some things the way I did, and if you find a fix, you can let me know.⌘
The main magic happens in shorty.php, which generates the short URL for a given longer one. It also accounts for custom “vanity” URLs, in case you want one which is easier to remember. For example, http://⌘am.ws/aditya points to my homepage. It also handles duplicates, in that it will return the existing short URL if you try to shorten one that has already been reduced. That saves the database from unnecessarily filling up with redundant mappings.⌘
if(!isset($_GET['url']) || trim($_GET['url']) == "")
    die('Nothing to shorten.');
$host = parse_url($_GET['url']);
if(in_array($host['host'], array("tr.im", "bit.ly", "tinyurl.com", "u.nu", "is.gd"))) # don't repeat yourself
    die('Nothing to shorten.');
$u = "DB_USERNAME";
$p = "DB_PASSWORD";
mysql_connect('localhost', $u, $p);
mysql_select_db('DB_NAME');
# check if URL already exists ("=" rather than LIKE, since "%" in an encoded URL is a LIKE wildcard)
$url_exists = mysql_query("SELECT `short` FROM `mapping` WHERE `long` = '". strip_tags(urlencode($_GET['url'])) ."'");
if(mysql_num_rows($url_exists) > 0){
    $n_row = mysql_result($url_exists, 0, 'short');
} else {
    if(empty($_GET['vanity'])): # no vanity word given: generate the next code
        $row = mysql_query("SELECT count(short) FROM `mapping` WHERE `short` REGEXP \"[[:digit:]]+\"");
        $rows = mysql_result($row, 0, "count(short)");
        $n_row = dechex($rows+1);
    else:
        $n_row = $_GET['vanity'];
    endif;
    $ins = "INSERT INTO `mapping` (`short`, `long`) VALUES ('". mysql_real_escape_string($n_row) ."', '". urlencode(preg_replace("/\/{2,}$/", "/", strip_tags($_GET['url'])))."');";
    mysql_query($ins);
}
$url = "http://SHORT_DOMAIN/$n_row";
The table schema is pretty simple. Two fields named short and long, which hold the expected data. The code is pretty straightforward as well, though there are some interesting bits:⌘
- Line 4: I have taken into account the popular shortening services so that I don’t shorten something that has already been shortened. That way be dragons. I could have un-shortened the shortened URL and then re-shortened it, but read that sentence again and you’ll see why I didn’t.
- Line 19: For non-vanity URLs, I get the number of numeric URLs from the database by using REGEXP "[[:digit:]]+", which matches values containing digits. This way, the progression remains the smallest, otherwise we’d be reaching 100s pretty soon. I make up for the slowness of REGEXP (which is hardly much) by using count() to just get a number, not all the rows.
- Line 26: strip_tags and urlencode should be self-explanatory, but I also do a preg_replace to remove any extra forward slashes from the end of shortened URLs, so that it doesn’t create problems when using .htaccess rules to redirect, since we use regular expressions there as well.
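The three points above can be sketched as small standalone functions, which makes them easy to try without a database. This is only an illustration under assumptions: the function and constant names are hypothetical, and the count of existing codes is passed in rather than queried.

```php
<?php
// Illustrative helpers only; none of these names come from shorty.php.

// Hosts treated as "already short", mirroring the list in the script.
const KNOWN_SHORTENERS = ['tr.im', 'bit.ly', 'tinyurl.com', 'u.nu', 'is.gd'];

// True when the URL's host belongs to a known shortening service.
function is_already_short(string $url): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && in_array(strtolower($host), KNOWN_SHORTENERS, true);
}

// Next automatic code: the number of existing codes, plus one, written in hex.
function next_short_code(int $existing_count): string
{
    return dechex($existing_count + 1);
}

// Strip tags, collapse two or more trailing slashes into one,
// then URL-encode the result for storage.
function normalise_long_url(string $url): string
{
    return urlencode(preg_replace('/\/{2,}$/', '/', strip_tags($url)));
}

var_dump(is_already_short('http://bit.ly/abc'));        // bool(true)
echo next_short_code(254), "\n";                        // ff
echo normalise_long_url('http://example.com/page//'), "\n";
```

Writing the counter in hexadecimal is what keeps the codes short: 255 links fit in two characters instead of three.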
To make it work, just call shorty.php?vanity={vanity_word}&url={long_url}.⌘
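One thing to watch when making that call: the long URL has to be URL-encoded, otherwise its own "?" and "&" get mixed up with shorty.php's parameters. A minimal sketch of building the request, assuming a hypothetical helper name and the same SHORT_DOMAIN placeholder:

```php
<?php
// Hypothetical helper for building a request to shorty.php.
function shorty_request_url(string $base, string $long_url, string $vanity = ''): string
{
    // http_build_query() URL-encodes each value, so "?" and "&" inside
    // the long URL cannot be mistaken for shorty.php's own parameters.
    return $base . '/shorty.php?' . http_build_query([
        'vanity' => $vanity,
        'url'    => $long_url,
    ]);
}

echo shorty_request_url('http://SHORT_DOMAIN', 'http://example.com/?a=1&b=2', 'demo'), "\n";
```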
The result of the script is a nice short URL that is stored in the $url variable for our use. I output it in a pre-selected text box so that I can copy it straightaway[2]. It’s pretty straightforward, because there is no click tracking or any metrics involved. I personally don’t care for that stuff, because my server logs tell me everything I need to know anyway.⌘
I thought of using pretty syntax to shorten URLs, but the regular expression for that combined with Keywurl made it quite painful to get right. There were issues with query strings in the URL to be shortened that were being stripped by mod_rewrite and other stupidities. So I just got rid of it, and stuck to directly calling the PHP script, which is what every other service does for its API.⌘
Putting short URLs to use
Now to the actual redirection. Here we need more than one piece, since the short URL points to a lookup PHP script that redirects us to the proper place, based on the query string that .htaccess translates the short URL into. It’s less convoluted than it sounds, although messing around with .htaccess isn’t a good idea if you’re new to it. Those “500 Internal Server Error”s are almost always caused by a misbehaving .htaccess. So unless you know what you’re doing, use the one I wrote; it works just fine, and is really small too:⌘
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.+)$ "?short=$1" [L]
The !-f is to make sure that I can still access files if I want to, and the RewriteBase is to avoid loops.⌘
The PHP script does a simple check against the database to locate the destination URL, and using a 301 redirect sends the user there. It’s pretty simple really. However, if you need to track clicks and other data, this is the script you should add it to.⌘
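For the curious, the lookup-and-redirect step might look roughly like this. It is a sketch, not the script from the repository: an array stands in for the mapping table, the function names are hypothetical, and the 404 for unknown codes is my assumption here.

```php
<?php
// Sketch only: an array stands in for the `mapping` table, and the
// function names are hypothetical rather than taken from Shorty.

// Decide how to answer a request for a short code.
function redirect_response(array $mapping, string $short): array
{
    if (!isset($mapping[$short])) {
        return ['status' => 404, 'location' => null];
    }
    // shorty.php stores long URLs URL-encoded, so decode before redirecting.
    return ['status' => 301, 'location' => urldecode($mapping[$short])];
}

// Send the response; only meaningful when running under a web server.
function send_response(array $response): void
{
    if ($response['status'] === 301) {
        header('Location: ' . $response['location'], true, 301);
    } else {
        http_response_code(404);
        echo 'Unknown short URL.';
    }
}

$mapping  = ['1a' => 'http%3A%2F%2Fexample.com%2Fpage%2F'];
$response = redirect_response($mapping, '1a');
echo $response['status'], ' ', $response['location'], "\n"; // 301 http://example.com/page/
```

On the server, the rewritten request would be handled with something like send_response(redirect_response($mapping, $_GET['short'] ?? '')), and redirect_response is also where a click counter could be bumped.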
That’s about it. Not bad for something that looked complicated but got over pretty quick. Now I have short URLs for every essay here[3], as well as my profiles on the various web services I use (like Flickr) that are too long to reference otherwise. I want to throw in a quick note about unicode characters in domain names. They don’t cost extra or anything, but if you want to use a unicode character in your domain name, your choice of TLDs will be limited. It took me quite a while to get the one I wanted, but I finally got it from iDotz.net. You might have to hunt around, but give these guys a go.⌘
I haven’t thought up all the ways I can use this, but it’s good to know I have the option.⌘
Download
Shorty — no guarantees, no warranties. This was a part time undertaking for my needs and purposes, so it most probably won’t fulfill yours. If you do find it useful, more power to you. Released under a modified MIT license.⌘
1. After writing this, I discovered another Shorty, but since this will probably never be as popular as that, I don’t plan to change the name.
2. I wish there was a cross-browser way to copy things to the clipboard using Javascript. I found Flash-based solutions, but they were all broken with Flash 10. Most of them didn’t work with Safari at all.
3. The “⌘” symbol is the link. These are not stored in the database, since they directly take you to Wordpress, after which the internal redirects take over. Sure, it is 2 redirects, but it’s not that bad.

