Google PageRank PHP Class

It appears that Google has changed its PageRank mechanism. I'm currently investigating ways to restore the functionality of this class.

Google PageRank Checker

While developers and designers can debate the importance of different search engine optimization strategies, one metric that simply can't be argued with is a website's Google PageRank, or its importance in driving traffic to the site.  Achieving a better PageRank was a consideration when redesigning this blog.  We can discuss how to achieve a better PageRank in another post -- this post will focus on how you can retrieve a page's Google PageRank using a small PHP class that I have created.

The PHP

The base functions within this class were created by Jamie Scott -- all credit for the base functions goes to him.  I've simply placed the functionality into PHP class format for easy use and updated the code to be a bit more transparent.  As far as PHP classes go, this one is quite small:

// Declare the class
class GooglePageRankChecker {
	
	// Track the instance
	private static $instance;
	
	// Static entry point: lazily creates the single instance, then runs the check
	public static function getRank($page) {
		// Create the instance, if one isn't created yet
		if(!isset(self::$instance)) {
			self::$instance = new self();
		}
		// Return the result
		return self::$instance->check($page);
	}
	
	
	// Convert string to a number
	function stringToNumber($string,$check,$magic) {
		$int32 = 4294967296;  // 2^32
	    $length = strlen($string);
	    for ($i = 0; $i < $length; $i++) {
	        $check *= $magic; 	
	        //If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31), 
	        //  the result of converting to integer is undefined
	        //  refer to http://www.php.net/manual/en/language.types.integer.php
	        if($check >= $int32) {
	            $check = ($check - $int32 * (int) ($check / $int32));
	            //if the check less than -2^31
	            $check = ($check < -($int32 / 2)) ? ($check + $int32) : $check;
	        }
	        $check += ord($string[$i]); // square brackets: the {} offset syntax was removed in PHP 8
	    }
	    return $check;
	}
	
	// Create a url hash
	function createHash($string) {
		$check1 = $this->stringToNumber($string, 0x1505, 0x21);
	    $check2 = $this->stringToNumber($string, 0, 0x1003F);
	
		$factor = 4;
		$halfFactor = $factor/2;

	    $check1 >>= $halfFactor;
	    $check1 = (($check1 >> $factor) & 0x3FFFFC0 ) | ($check1 & 0x3F);
	    $check1 = (($check1 >> $factor) & 0x3FFC00 ) | ($check1 & 0x3FF);
	    $check1 = (($check1 >> $factor) & 0x3C000 ) | ($check1 & 0x3FFF);	

	    $calc1 = (((($check1 & 0x3C0) << $factor) | ($check1 & 0x3C)) << $halfFactor ) | ($check2 & 0xF0F );
	    $calc2 = (((($check1 & 0xFFFFC000) << $factor) | ($check1 & 0x3C00)) << 0xA) | ($check2 & 0xF0F0000 );

	    return ($calc1 | $calc2);
	}
	
	// Create checksum for hash
	function checkHash($hashNumber)
	{
	    $check = 0;
		$flag = 0;

		$hashString = sprintf('%u', $hashNumber) ;
		$length = strlen($hashString);

		for ($i = $length - 1;  $i >= 0;  $i --) {
			$r = (int) $hashString[$i]; // square brackets: the {} offset syntax was removed in PHP 8
			if(1 === ($flag % 2)) {			  
				$r += $r;	 
				$r = (int)($r / 10) + ($r % 10);
			}
			$check += $r;
			$flag ++;	
		}

		$check %= 10;
		if(0 !== $check) {
			$check = 10 - $check;
			if(1 === ($flag % 2) ) {
				if(1 === ($check % 2)) {
					$check += 9;
				}
				$check >>= 1;
			}
		}

		return '7'.$check.$hashString;
	}
	
	function check($page) {

		// Open a socket to the toolbarqueries address, used by Google Toolbar
		$socket = fsockopen("toolbarqueries.google.com", 80, $errno, $errstr, 30);

		// If a connection can be established
		if($socket) {
			// Prep socket headers
			$out = "GET /search?client=navclient-auto&ch=".$this->checkHash($this->createHash($page))."&features=Rank&q=info:".$page."&num=100&filter=0 HTTP/1.1\r\n";
			$out .= "Host: toolbarqueries.google.com\r\n";
			$out .= "User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
			$out .= "Connection: Close\r\n\r\n";

			// Write settings to the socket
			fwrite($socket, $out);

			// Read the response line by line, looking for the "Rank_" marker
			$result = "";
			while(!feof($socket)) {
				$data = fgets($socket, 128);
				$pos = strpos($data, "Rank_");
				if($pos !== false){
					// Skip the 9-character prefix (e.g. "Rank_1:1:") and strip the trailing newline
					$result = trim(substr($data, $pos + 9));
				}
			}
			// Close the connection
			fclose($socket);
			
			// Return the rank!
			return $result;
		}
	}
}

The createHash and checkHash methods perform the deep down mathematical operations.  Once those are out of the way, the check method connects to Google's toolbar server, disguising itself as a toolbar via the User-Agent header, to get the page's PageRank.  A singleton pattern is used since creating individual instances isn't important:

$rank = GooglePageRankChecker::getRank("davidwalsh.name"); // returns "5"

The number provided back represents the PageRank for the URL provided!  This PHP class can be used on its own, but I've created a MooTools-powered script to retrieve an address' PageRank via some simple AJAX.
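
To make the checkHash step a little more concrete, here is a standalone sketch of the same check-digit arithmetic, pulled out of the class so it can be run on its own. The function name toolbarChecksum is my own (hypothetical, not part of the class), and it assumes the hash has already been formatted as a string of decimal digits:

```php
<?php
// Hypothetical standalone version of the checkHash() arithmetic above.
// Input: the hash as a string of decimal digits (what sprintf('%u', ...) produces).
function toolbarChecksum($digits) {
    $check = 0;
    $flag = 0;

    // Walk the digits right to left, doubling every second digit and
    // folding two-digit results back into one digit (e.g. 14 -> 1 + 4).
    for ($i = strlen($digits) - 1; $i >= 0; $i--) {
        $r = (int) $digits[$i];
        if ($flag % 2 === 1) {
            $r += $r;
            $r = (int) ($r / 10) + ($r % 10);
        }
        $check += $r;
        $flag++;
    }

    // Turn the running sum into a single check digit
    $check %= 10;
    if ($check !== 0) {
        $check = 10 - $check;
        // Extra adjustment when the number of digits is odd
        if ($flag % 2 === 1) {
            if ($check % 2 === 1) {
                $check += 9;
            }
            $check >>= 1;
        }
    }

    // The class prefixes the result with a literal '7'
    return '7' . $check . $digits;
}

echo toolbarChecksum('1234'); // prints 761234
```

For '1234' the processed digits sum to 4 + 6 + 2 + 2 = 14, which leaves 4 after the modulo, so the check digit is 10 - 4 = 6.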

The MooTools JavaScript

This quick inline MooTools script responds to a form submission, making an AJAX call to a PHP script that runs the class provided above:

// When the DOM is ready
window.addEvent("domready",function() {
	
	// When the form is submitted...
	var form = document.id("rankForm"), request, display, domain, domainValue;
	form.addEvent("submit",function(e) {
		// Stop the event
		if(e) e.stop();
		
		// Create request, if not already created
		if(!request) {
			domain = document.id("domain");
			display = document.id("rankerDisplay");
			request = new Request({
				url: "pagerank-checker.php", 
				method: "post",
				onComplete: function(response) {
					display.setStyle("display","block").set("text","Page rank for " + domainValue + " is: " + response);
				}
			});
		}
		
		// Get the value of the URL
		domainValue = domain.get("value");
		
		// Send the request
		request.send({ data: { domain: domainValue } });
	});
	
});
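
The request above posts a domain field to pagerank-checker.php, which isn't shown in this post. A minimal sketch of what that script could look like follows. The helper name respondWithRank, the 'n/a' fallback and the whitespace check are my own assumptions rather than the code behind the demo, and the lookup is passed in as a callable so the helper can be tried without contacting Google:

```php
<?php
// Hypothetical helper for pagerank-checker.php (not from the original demo).
// $post   : the submitted form fields, normally $_POST
// $lookup : any callable that takes a page address and returns its rank
function respondWithRank(array $post, $lookup) {
    $domain = (isset($post['domain']) && is_string($post['domain']))
        ? trim($post['domain'])
        : '';

    // Reject empty input and anything containing whitespace, since the class
    // writes this value straight into a raw HTTP request line.
    if ($domain === '' || preg_match('/\s/', $domain)) {
        return 'n/a';
    }

    $rank = call_user_func($lookup, $domain);

    // check() returns nothing when the socket cannot be opened
    return ($rank === null || $rank === '') ? 'n/a' : (string) $rank;
}

// Stubbed lookup for demonstration only; it always answers "5".
echo respondWithRank(
    array('domain' => 'davidwalsh.name'),
    function ($domain) { return '5'; }
); // prints 5

// With the class loaded, the real script would be along the lines of:
// echo respondWithRank($_POST, array('GooglePageRankChecker', 'getRank'));
```

Whatever the script echoes is what the onComplete handler receives as its response text.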

Using a JavaScript snippet like this, you could easily add a JavaScript-fronted Google PageRank checker with the framework of your choosing.

Outstanding work by Jamie Scott in creating the base functions to retrieve a page's Google PageRank with PHP.  Hopefully my class makes the PageRank code a bit more portable and recognizable.


Comments

  1. Brad

    This would be perfect for a CMS with SEO tools. You should make a WordPress plugin out of it!

    • BlaineSch

      I’m not sure a wordpress plugin would be good for this. Maybe modify a current SEO plugin to show your current pagerank, which would be a very minor addition, and I’m sure already exists.

  2. peter

    A download would be really helpful for those of us who are still learning….

    Tried using the class and it didn’t work, so I’ve gone away frustrated, realising that I just wasted my time on your site, and that’s not good for me or you.

    • David Walsh

      You’ve not described the error in any way. How can anyone help you? Did you not put the php code in tags?

      I didn’t offer a download because the source is all there…as it is with every post.

    • peter

      Yes the copy button did not work for me, but I just selected copied and pasted the code.

      So the code is there but then you have to add the PHP and script tags, the link to the mootools library, the form to submit the URL… we are talking about recreating the demo, or there is no point in having a demo if it can’t be recreated..

      and I can’t recreate it. Sorry if I am not a good enough coder, that’s why I’m reading your tutorials…

    • The Frosty

      If you used the copy method from the code all the HTML gets encoded. So you need to manually copy or re-encode the HTML.

      i.e. &lt; should be <

    • BlaineSch

      Using Firefox 4.0, the “copy” button does not even work.

  3. The Frosty

    This is awesome, putting it to use.

  4. Ondřej Švec

    Pretty nice class! I am just curious: Why do you initialize the instance at all? Wouldn’t be better to just set all functions to static and use it the way without initialization of the class?

  5. joe

    This is pretty cool (as is the rest of your site) and I appreciate you sharing the code but I’m not quite sure I see the point. There are plenty of tools already out there for checking “toolbar” PageRank (not to be confused with *actual* PageRank). I use a firefox extension called Quirk SearchStatus that automatically shows the PR (and other metrics) of whichever page I’m on. seoforfirefox is another great plugin I use that can do a lookup of an individual page as well as list metrics of sites in SERPs.

    As for not being able to argue the importance of PR in driving traffic or having any real bearing on SEO whatsoever, I’d have to disagree with you. PR hasn’t been a meaningful factor for quite some time now and it’s not uncommon to see PR 0 sites whipping the pants off sites with much higher PR scores in the rankings.

  6. Victor Bolshov

    just a few notes on PHP code.

    1. $c = __CLASS__;
    self::$instance = new $c;

    you could write “self::$instance = new self();” instead.

    2. stringToNumber() is quite useless on a 64-bit machine

    3. function check($domain) could make use of PHP’s file_get_contents() and stream context (see http://php.net/file_get_contents, Example #4 Using stream contexts). Cheers!

    • David Walsh

      Thank you for the improvement ideas. Could you explain what you mean by the second item, for the sake of others?

      Thank you!

    • Victor Bolshov

      In fact, it seems the code might not work properly even on 32bit system. 2147483647 is the max int in 32bit PHP (we don’t have unsigned ints here ;) ). On a 64bit system the limit is high enough not to take care in most cases.

  7. kyushudan

    Very cool. I’m in shocked disbelief with the result it spits back out after entering my domain. The ego took a beating. Guess I gotta build a bridge…

  8. Konstantin

    Hi David, you mention “this post will focus on how you can retrieve a domain’s Google PageRank” but please don’t forget that PageRank is assigned to a page, not a domain. Other than that, great post, thanks!

    ~ @kovshenin

  9. Aslam Doctor

    Awsome Article. I will use this in my code library. Thanks Mr.Walsh :)

  10. aykak

    Wow awsome Tuts will use for future projects.. :)

  11. Kamil Skrzypiński

    Thanks for share. I use code from Luc De Brouwer ( http://www.lucdebrouwer.nl/using-php-to-retrieve-the-google-pagerank-of-any-domain/ ) adapted by me to CodeIgniter Library…

    It’s effective solution but sometimes it’s not working – Google respond with error page. So if you want to use this code in production, remember about occasional limitations.

  12. Leando

    I’m getting all the websites with PR5… Is it a bug or something?

  13. SirChunk

    Hey David,
    On your demo, how do you get around it if too many requests are made as surely G would ban the server IP?

  14. Mashary

    this code work for me
    on my local host with PHP 5.2

    thanks a lot david

  15. Mashary

    What happens if the Google Toolbar host does not respond?

  16. Maciej Mikulski

    Few days ago Google changed query phrase from “search?” to “tbr?”
    so the script stopped working.
    To bring it back you should change mentioned phrase from:
    $out = "GET /search?client=navclient-auto&ch=".$this->checkHash
    to:
    $out = "GET /tbr?client=navclient-auto&ch=".$this->checkHash

    @David: thanks for the script.

  17. Jürgen

    Thx to Maciej Mikulski for the information. Saved me a lot of time.

  18. James Wade

    Hi Guys,

    You’ll find an update version of the code here: https://github.com/phurix/pagerank

    Hope this helps!

  19. Thirupathi

    Hello.. I have tried with the code above. But i could not get the pagerank. Can you please help to get the pagerank. I need it . I have seen in your demo also but could not get it. Please help me urgent.

    Thanks,
    Thiru :)

  20. Nitin Gupta

    Tried this script but not working, could you pl. update the script and share code base.

