How to make a URL Shortener (with code in PHP and MySQL database)


Considering the number of URL shortening services out there, I decided to write a short review of the subject. URL shorteners were made popular by micro-blogging services that limit the number of characters per message, e.g. Twitter.

So, here’s what’s covered:

  • URL Redirection
  • Model
  • Implementation
  • Pitfalls and misuse

Update (2011-07-29):
Reorganized code. It’s now divided into classes, which makes it easier to understand and reuse. It uses PDO for the database connection instead of the old mysql_* functions. Added QR codes for generated short links. This tutorial won’t change anymore, since it’s at the basic level it needs to be. For more information, you can follow the URL shortening project I started, named BooRL.

URL Redirection

There are many uses for URL redirection: moving to a new domain, logging links, manipulating search engines, or just buying similar domain names to catch typos. It is important to keep SEO (Search Engine Optimization) in mind when doing this. There are several ways to do it:

JavaScript Redirect

You can redirect visitors using JavaScript code within the HTML. Search engines don’t like this method. Imagine you have an indexed page about something. It would be easy to add JavaScript code later that redirects to another website (Viagra, Casino, Poker…).

<script type="text/javascript">
    window.location = "http://www.google.com";
</script>

Parked Domains

You could park an additional domain and point its DNS to your original site’s hosting. Whenever someone types the additional domain, your main site is opened. This is also not recommended, because search engines penalize duplicate content. Another downside is that every domain gets its own page ranking, instead of only the original website being ranked.

HTTP Redirects

In the HTTP protocol used by the World Wide Web, a redirect is a response with a status code beginning with 3 that induces a browser to go to another location.

  • 300 multiple choices (e.g. offer different languages)
  • 301 moved permanently
  • 302 found (originally temporary redirect, but now commonly used to specify redirection for an unspecified reason)
  • 303 see other (e.g. for results of cgi-scripts)
  • 304 not modified
  • 305 use proxy
  • 307 temporary redirect

Out of these, 301 and 302 are the most commonly used. As far as I know, Google handles 302 well, meaning it transfers your page ranking to the redirect target, but other search engines aren’t as nice.
So we’re left with the only right way from an SEO perspective: the 301 redirect.

Model

We’re gonna use the HTTP 301 redirect. Let’s discuss the model. Every URL shortener service has a domain and a key. For example, http://goo.gl/4uHOQ has:

  • domain: goo.gl
  • key: 4uHOQ

The key is different for every URL. If you are making a simple URL redirection service, you can return the same key for the same long URL. If you want to add URL logging for each user, with some analytics, this won’t work, since each user would then need a separate key for the same long URL. We’re gonna stick to the simple model, though, for the sake of understanding.

We’re gonna store each URL in a database. We need one (database) table:

id  long_url
1   http://www.codeden.net/
2   http://www.codeden.net/2011/05/skype-login-window-disappears/
3   http://www.codeden.net/2011/05/extracting-pixel-values-from-videos-using-opencv/

Now we have a not-so-bad mapping of long URLs to numbers. This way, using an 8-character key, we can represent 10^8 long URLs. It is clear that this isn’t the best choice, and that a good URL shortening service would hit the cap in only a few days. But let’s say we map the id to another numeral system. The id is in base 10, which means each position can hold one of 10 digits. If we map it to a system with a larger base, e.g. 64, each position can hold one of 64 symbols, and the total number of keys of length 8 is 64^8.
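
To put numbers on that, here is a quick sketch of the arithmetic. The function name keyCapacity is made up for this example, and it assumes a 64-bit PHP 7+ build so that 64^8 still fits in an integer.

```php
<?php
// Number of distinct keys of a given length over an alphabet of $base symbols.
// Assumes 64-bit PHP 7+ (hypothetical helper, not part of the shortener code).
function keyCapacity(int $base, int $length): int {
    return $base ** $length;
}

echo keyCapacity(10, 8), "\n"; // 100000000
echo keyCapacity(64, 8), "\n"; // 281474976710656
```

So the same 8 characters go from a hundred million keys to roughly 281 trillion.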

Implementation

Let’s put all this together using PHP. First, we need a way to do a 301 redirect:

header('HTTP/1.1 301 Moved Permanently');
header('Location: ' . $url);
exit();

where $url is the webpage we are redirecting to.

Then we need a number converter (really a mapping between numeral systems), which I posted a few weeks earlier.
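
The converter itself isn’t listed in this article, so here is a minimal sketch of what such a mapping could look like. The function names idToKey and keyToId and the KEY_ALPHABET constant are made up for this example. Only the positions of A-Z and a-z are inferred from the sample short codes shown in this article (1 gives A, 129231 gives eiO); where the digits, + and = sit is a guess. It assumes PHP 7+.

```php
<?php
// Assumed 64-symbol alphabet: A-Z at positions 1-26 and a-z at 27-52 match the
// article's sample rows; the placement of '=', '+' and the digits is a guess.
const KEY_ALPHABET = '=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+';

// Convert a non-negative decimal id into a key over KEY_ALPHABET.
function idToKey(int $id): string {
    if ($id < 0) {
        throw new InvalidArgumentException('id must not be negative');
    }
    $alphabet = KEY_ALPHABET;
    $base = strlen($alphabet); // 64
    $key = '';
    do {
        // Prepend the symbol for the current least significant digit
        $key = $alphabet[$id % $base] . $key;
        $id = intdiv($id, $base);
    } while ($id > 0);
    return $key;
}

// Convert a key back into its decimal id.
function keyToId(string $key): int {
    if ($key === '') {
        throw new InvalidArgumentException('key must not be empty');
    }
    $base = strlen(KEY_ALPHABET);
    $id = 0;
    foreach (str_split($key) as $char) {
        $digit = strpos(KEY_ALPHABET, $char);
        if ($digit === false) {
            throw new InvalidArgumentException("Invalid character in key: $char");
        }
        $id = $id * $base + $digit;
    }
    return $id;
}

echo idToKey(129231), "\n"; // eiO
```

The Shortener class further down calls NumberConverter::fromDecimalToBase($id, 64) instead; the sketch only illustrates the idea behind it.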

We’re going to extend our database table and keep the mapped value together with the long URL, instead of decoding the key back to decimal and looking up the id, since storage space is a lot cheaper than processing power.

id      short_url  long_url
1       A          http://www.codeden.net/
2       B          http://www.codeden.net/2011/05/skype-login-window-disappears/
3       C          http://www.codeden.net/2011/05/extracting-pixel-values-from-videos-using-opencv/
...     ...        ...
129231  eiO        http://www.codeden.net/category/random/

SQL for the database:

CREATE DATABASE IF NOT EXISTS `shortener`;

CREATE TABLE IF NOT EXISTS `shortener`.`mapping` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `short_code` VARCHAR(10) NOT NULL,
  `long_url` TEXT NOT NULL,
  `insert_date` DATETIME NULL,
  PRIMARY KEY (`id`),
  INDEX `short_code` (`short_code` ASC),
  INDEX `long_url` (`long_url`(20) ASC)
)
ENGINE = MyISAM
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_general_ci;

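Indexes on short_code and long_url are there for faster select queries. MyISAM is chosen because the table needs no referential integrity (foreign keys) and it is fast. Keep in mind that MyISAM doesn’t support transactions, so the beginTransaction(), commit() and rollBack() calls in the class below won’t actually roll anything back with this engine; use InnoDB if you need that.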

When someone goes to www.sho.rt/someKey, we should look up the short_code someKey in the database and load the matching long_url into $url in the PHP redirect page.
The easiest way is to route every request to a single page, which extracts the key, asks the database for the long_url and runs the redirect.
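
The key extraction step can be sketched on its own like this. The function name extractKey is made up for this example, and it assumes PHP 7+ and that the key is always the last path segment of the request URI.

```php
<?php
// Return the last path segment of a request URI, ignoring any query string.
// Hypothetical helper; returns an empty string when no key was supplied.
function extractKey(string $requestUri): string {
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    $segments = explode('/', $path);
    return end($segments);
}

echo extractKey('/someKey'), "\n"; // someKey
```

An empty result means the visitor opened the front page, so the shortening form should be shown instead of a redirect.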

I’ve already written about how to set up the Apache HTTP server and mod_rewrite (which comes bundled with it) to redirect all requests to a single page.

URL Shortener class:

// Number converter
include('converter.php');

/**
 * URL Shortener class
 *
 */
class Shortener {

	// Database holder
	private $database;
	// Short Code Regular Expression
	private $keyRegex = "/[^A-Za-z0-9\+\=]/";
	// URL Regular Expression
	private $urlRegex = '/^(https?|ftp):\/\/[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+)+(\/|(\/[A-Za-z0-9_\-\?\+\=\&\.]+)+)?\/?$/';

	// Connect to database on construction
	public function __construct() {
		$this->database = new PDO('mysql:host=localhost;dbname=shortener', 'root', '');
	}

	/**
	 * Get long URL for given key
	 * 
	 * @param key - the long URL short code
	 * @return Long URL
	 */
	public function getLongURL($key) {
		// Validate the key to contain only characters used in the mapping
		if (preg_match($this->keyRegex, $key)) {
			throw new Exception("Key contains characters that are not allowed!");
		}

		// Search for the key in the database, using a prepared statement
		$statement = $this->database->prepare("SELECT long_url FROM mapping WHERE BINARY short_code = ?");
		$statement->execute(array($key));
		$result = $statement->fetchAll();

		if (sizeof($result) == 1) {
			$url = $result[0]['long_url'];
		} else {
			throw new Exception("Key invalid!");
		}

		return $url;
	}

	/**
	 * Shortens a URL if it doesn't exist, otherwise returns short code
	 * 
	 * @param url - URL to be shortened
	 * @return Short Code for given URL
	 */
	public function insertNewURL($url) {
		// Check if form was submitted and add the URL to the database if it doesn't exist,
		// otherwise return the shortcode of the long_url

		// Validate entered URL
		if (!preg_match($this->urlRegex, $url)) {
			throw new Exception('You have entered an invalid URL.');
		}
		/*
		 * This is a potential pitfall, since it can be misused
		 * for attacking a HTTP server (Denial-of-Service)
		 */
		// Check if the URL exists
		if (!get_headers($url)) {
			throw new Exception('URL doesn\'t exist.');
		}

		$this->database->beginTransaction();
		// Search for the url in the database (prepared statement, so no manual escaping is needed)
		$statement = $this->database->prepare("SELECT short_code FROM mapping WHERE BINARY long_url = ?");
		$statement->execute(array($url));
		$result = $statement->fetchAll();

		// If found, return short url
		if (sizeof($result) == 1) {
			$short = $result[0]['short_code'];
			// Nothing to write, so just close the transaction
			$this->database->commit();
		} else {
			// Create a new short_code for the given URL
			$result = $this->database->query("SELECT COALESCE(MAX(id + 1), 1) FROM mapping")->fetchAll();
			if (sizeof($result) == 1) {
				$id = intval($result[0][0]);
			} else {
				$this->database->rollBack();
				throw new Exception('Can\'t get id.');
			}

			// Convert the ID to a new base
			$short = NumberConverter::fromDecimalToBase($id, 64);

			// Insert the new URL data into the database
			$statement = $this->database->prepare("INSERT INTO mapping (id, short_code, long_url, insert_date)" .
					" VALUES (?, ?, ?, NOW())");
			$result = $statement->execute(array($id, $short, $url));

			if ($result) {
				$this->database->commit();
			} else {
				$this->database->rollBack();
				throw new Exception('Could not add the URL.');
			}
		}

		return $short;
	}

}

Let’s not forget that we also need an interface for adding new URLs to the database. On top of that, we show QR codes for the shortened URLs using the QR class.

Here’s the PHP for index.php put together:


// URL Shortener class
include("shortener.php");
// QR Class
include("qr.php");

// Instantiate Shortener
$shortener = new Shortener();

// Extract the key (explode() replaces split(), which was removed in PHP 7)
$key = explode("/", $_SERVER['REQUEST_URI']);
$key = $key[sizeof($key) - 1];

try {
	// If there is a key supplied, try to find it in the database,
	// otherwise show the page for shortening URLs
	if (strlen($key) > 0) {
		// Get Long URL for the given key
		$url = $shortener->getLongURL($key);
		// Redirect
		header('HTTP/1.1 301 Moved Permanently');
		header('Location: ' . $url);
		exit();
	} else {
		// Show a simple form with URL field and a button
		echo '<html>';
		echo '<head>';
			echo '<title>URL Shortener</title>';
		echo '</head>';
		echo '<body>';
			echo '<form name="input" action="" method="post">';
				echo 'URL: <input type="text" name="url" /> <input type="submit" ';
					echo 'value="Shorten" />';
			echo '</form>';
		
		// Read the submitted URL, if any (avoids an undefined index notice on a plain GET)
		$url = isset($_POST['url']) ? $_POST['url'] : '';
		// Check if form was submitted and add the URL to the database if it doesn't exist,
		// otherwise return the shortcode of the long_url
		if (strlen($url) > 0) {
			// Get domain
			$domain = $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
			// Create new short code or get old if it already exists
			$shortCodeURL = $domain . $shortener->insertNewURL($url);
			// Show the shortcode
			echo $shortCodeURL . '<br />';
			// Get QR Image for the generated code
			$qr = QR::getQRforURL($shortCodeURL, 200);
			// Show image
			echo '<img src="' . $qr . '" alt="QR" />';
		}
		
		echo '</body>';
		echo '</html>';
	}
} catch (Exception $e) {
	// Catch exceptions if any arise and show a message
	echo "Error! " . $e->getMessage();
	die();		
}

Pitfalls and misuse

As said earlier, this is a very simple shortener. Here are a few things to keep in mind if you are going to make a commercial shortener:

  • Get a short domain, no longer than 5-6 characters including the dot (.)
  • Keep integer limits in mind: PHP’s maximum integer (PHP_INT_MAX) is 2147483647 on 32-bit builds (9223372036854775807 on 64-bit builds), and PHP has no unsigned integers. The MySQL INT column used for id holds up to 2147483647, or 4294967295 if declared UNSIGNED
  • You’ll need a more complex database (users, link analytics)
  • You’ll need a way to deal with decayed links, i.e. links that haven’t been used for some time
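
To see how the integer limit relates to key length, here is a small sketch that counts how many base-64 symbols an id needs. The function name keyLength is made up for this example, and it assumes a 64-bit PHP 7+ build.

```php
<?php
// Number of symbols needed to write a non-negative $id in the given base.
// Hypothetical helper for illustration only.
function keyLength(int $id, int $base): int {
    $length = 1;
    while ($id >= $base) {
        $id = intdiv($id, $base);
        $length++;
    }
    return $length;
}

echo keyLength(2147483647, 64), "\n"; // 6
echo keyLength(4294967295, 64), "\n"; // 6
```

In other words, with a plain INT id the keys never grow past 6 characters, so the id column overflows long before the 8-character key space discussed earlier is used up.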

Misuse

When URL shorteners appeared, they were often used to disguise the underlying address. You have no idea where a short link leads until you click it. Popular services like Facebook and Digg added a prefetch operation that shows some of the link’s contents, so you can’t get fooled.
 


 
That’s all, and thanks for reading, folks.


15 thoughts on “How to make a URL Shortener (with code in PHP and MySQL database)”

  1. outis says:

    The mysql extension is out of date and on its way to deprecation. PDO with prepared statements can be easier, safer and more performant.

    The singleton pattern isn’t advantageous here and complicates unit testing.

