Google PageRank PHP Class

It appears that Google has changed their Page Rank mechanism. I'm currently investigating ways to restore the functionality of this class.

Google PageRank Checker

While developers and designers can debate the importance of different search engine optimization strategies, one metric that simply can't be argued is a website's Google PageRank, or its importance in driving traffic to the site.  Achieving a better PageRank was a consideration when redesigning this blog.  We can discuss how to achieve a better PageRank in another post -- this post will focus on how you can retrieve a page's Google PageRank using a small PHP class that I have created.

The PHP

The base functions within this class were created by Jamie Scott -- all credit for the base functions goes to him.  I've simply placed the functionality into PHP class format for easy use and updated the code to be a bit more transparent.  As far as PHP classes go, this one is quite small:

// Declare the class
class GooglePageRankChecker {
	
	// Track the instance
	private static $instance;
	
	// Static entry point: creates the shared instance on first use
	public static function getRank($page) {
		// Create the instance, if one isn't created yet
		if(!isset(self::$instance)) {
			self::$instance = new self();
		}
		// Return the result
		return self::$instance->check($page);
	}
	
	
	// Convert string to a number
	function stringToNumber($string,$check,$magic) {
		$int32 = 4294967296;  // 2^32
	    $length = strlen($string);
	    for ($i = 0; $i < $length; $i++) {
	        $check *= $magic; 	
	        //If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31), 
	        //  the result of converting to integer is undefined
	        //  refer to http://www.php.net/manual/en/language.types.integer.php
	        if($check >= $int32) {
	            $check = ($check - $int32 * (int) ($check / $int32));
	            //if the check less than -2^31
	            $check = ($check < -($int32 / 2)) ? ($check + $int32) : $check;
	        }
	        $check += ord($string[$i]); // square brackets: the {} offset syntax is gone in PHP 8
	    }
	    return $check;
	}
	
	// Create a url hash
	function createHash($string) {
		$check1 = $this->stringToNumber($string, 0x1505, 0x21);
	    $check2 = $this->stringToNumber($string, 0, 0x1003F);
	
		$factor = 4;
		$halfFactor = $factor/2;

	    $check1 >>= $halfFactor;
	    $check1 = (($check1 >> $factor) & 0x3FFFFC0 ) | ($check1 & 0x3F);
	    $check1 = (($check1 >> $factor) & 0x3FFC00 ) | ($check1 & 0x3FF);
	    $check1 = (($check1 >> $factor) & 0x3C000 ) | ($check1 & 0x3FFF);	

	    $calc1 = (((($check1 & 0x3C0) << $factor) | ($check1 & 0x3C)) << $halfFactor ) | ($check2 & 0xF0F );
	    $calc2 = (((($check1 & 0xFFFFC000) << $factor) | ($check1 & 0x3C00)) << 0xA) | ($check2 & 0xF0F0000 );

	    return ($calc1 | $calc2);
	}
	
	// Create checksum for hash
	function checkHash($hashNumber)
	{
	    $check = 0;
		$flag = 0;

		$hashString = sprintf('%u', $hashNumber) ;
		$length = strlen($hashString);

		for ($i = $length - 1;  $i >= 0;  $i --) {
			$r = (int) $hashString[$i]; // read one digit; the {} offset syntax is gone in PHP 8
			if(1 === ($flag % 2)) {			  
				$r += $r;	 
				$r = (int)($r / 10) + ($r % 10);
			}
			$check += $r;
			$flag ++;	
		}

		$check %= 10;
		if(0 !== $check) {
			$check = 10 - $check;
			if(1 === ($flag % 2) ) {
				if(1 === ($check % 2)) {
					$check += 9;
				}
				$check >>= 1;
			}
		}

		return '7'.$check.$hashString;
	}
	
	function check($page) {

		// Open a socket to the toolbarqueries address, used by Google Toolbar
		$socket = fsockopen("toolbarqueries.google.com", 80, $errno, $errstr, 30);

		// If a connection can be established
		if($socket) {
			// Prep socket headers
			$out = "GET /search?client=navclient-auto&ch=".$this->checkHash($this->createHash($page))."&features=Rank&q=info:".$page."&num=100&filter=0 HTTP/1.1\r\n";
			$out .= "Host: toolbarqueries.google.com\r\n";
			$out .= "User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
			$out .= "Connection: Close\r\n\r\n";

			// Write settings to the socket
			fwrite($socket, $out);

			// When a response is received...
			$result = "";
			while(!feof($socket)) {
				$data = fgets($socket, 128);
				$pos = strpos($data, "Rank_");
				if($pos !== false){
					$pagerank = substr($data, $pos + 9);
					// Cast rather than add: arithmetic on the empty string is an error on PHP 8
					$result = (int) $pagerank;
				}
			}
			// Close the connection
			fclose($socket);
			
			// Return the rank!
			return $result;
		}
	}
}

The createHash and checkHash methods perform the deep down mathematical operations.  Once those are out of the way, the check method connects to Google's toolbar server, disguising itself as a toolbar via the User-Agent header, to get the page's PageRank.  A singleton pattern is used since creating individual instances isn't important:

$rank = GooglePageRankChecker::getRank("davidwalsh.name"); // returns "5"

The number provided back represents the PageRank for the URL provided!  This PHP class can be used on its own, but I've created a MooTools-powered script to retrieve an address' PageRank via some simple AJAX.
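
A quick aside on the hashing step: the loop inside stringToNumber is a multiply-and-add rolling hash kept within 32 bits, and the two seed/multiplier pairs createHash passes in (0x1505 with 0x21, and 0 with 0x1003F) are simply 5381/33 and 0/65599 in decimal.  Here is a minimal sketch of the same loop, assuming 64-bit PHP so that a bit mask can do the wrapping.  The name rollingHash is made up for illustration; it should give the same values as stringToNumber on a 64-bit build, but it is not a replacement on 32-bit PHP:

```php
<?php
// Sketch only: assumes 64-bit PHP, where the intermediate product
// always fits in a native integer. rollingHash() is a hypothetical
// name and is not part of the class above.
function rollingHash(string $string, int $seed, int $multiplier): int
{
    $hash = $seed;
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        // Multiply, keep only the low 32 bits, then add the next byte
        $hash = (($hash * $multiplier) & 0xFFFFFFFF) + ord($string[$i]);
    }
    return $hash;
}

// The two seed/multiplier pairs used by createHash()
var_dump(rollingHash("a", 0x1505, 0x21));  // int(177670), i.e. 5381 * 33 + 97
var_dump(rollingHash("ab", 0, 0x1003F));   // int(6363201), i.e. 97 * 65599 + 98
```

The mask plays the role of the subtraction branch in stringToNumber: both throw away everything above bit 31 before the next character is added.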

The MooTools JavaScript

This quick inline MooTools script responds to the form's submit event, making an AJAX call to a PHP script that runs the class provided above:

// When the DOM is ready
window.addEvent("domready",function() {
	
	// When the form is submitted...
	var form = document.id("rankForm"), request, display, domain, domainValue;
	form.addEvent("submit",function(e) {
		// Stop the event
		if(e) e.stop();
		
		// Create request, if not already created
		if(!request) {
			domain = document.id("domain");
			display = document.id("rankerDisplay");
			request = new Request({
				url: "pagerank-checker.php", 
				method: "post",
				onComplete: function(response) {
					display.setStyle("display","block").set("text","Page rank for " + domainValue + " is: " + response);
				}
			});
		}
		
		// Get the value of the URL field
		domainValue = domain.get("value");
		
		// Send the request
		request.send({ data: { domain: domainValue } });
	});
	
});

Using a JavaScript snippet like this, you could easily add a JavaScript-fronted Google PageRank checker with the framework of your choosing.
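
The pagerank-checker.php script that the request posts to isn't shown above, so here is a minimal sketch of what it might look like.  Treat the details as assumptions: the class file name GooglePageRankChecker.php and the cleanPageInput helper are made up for illustration, and getRank is assumed to be declared static.  The helper exists because check() writes the address straight into a raw HTTP request line, so anything with spaces or line breaks is rejected first:

```php
<?php
// Hypothetical helper (not from the class above): reduce user input to
// "host/path" form and reject anything with characters that could
// break the raw request line built inside check().
function cleanPageInput(string $raw): ?string
{
    $page = trim($raw);
    // Drop a leading scheme such as "http://" or "https://"
    $page = preg_replace('#^https?://#i', '', $page);
    // Allow only host characters plus an optional simple path
    if ($page === '' || !preg_match('#^[a-z0-9.-]+(/[a-z0-9._~/-]*)?\z#i', $page)) {
        return null;
    }
    return $page;
}

// Only runs for the POST sent by the MooTools script
if (isset($_POST['domain']) && is_string($_POST['domain'])) {
    // Assumed file name for the class shown earlier
    require 'GooglePageRankChecker.php';

    $page = cleanPageInput($_POST['domain']);
    header('Content-Type: text/plain');
    echo $page === null ? 'invalid address' : GooglePageRankChecker::getRank($page);
}
```

The plain-text response is what the onComplete callback receives as its response argument.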

Outstanding work by Jamie Scott in creating the base functions to retrieve a page's Google PageRank with PHP.  Hopefully my class makes the PageRank code a bit more portable and recognizable.

Discussion

  1. This would be perfect for a CMS with SEO tools. You should make a WordPress plugin out of it!

    • I’m not sure a wordpress plugin would be good for this. Maybe modify a current SEO plugin to show your current pagerank, which would be a very minor addition, and I’m sure already exists.

  2. peter

    A download would be really helpful for those of us who are still learning….

    tried using the class and it didn’t work so gone away frustrated and realising that I just wasted my time on your site, and that’s not good for me or you

    • You’ve not described the error in any way. How can anyone help you? Did you not put the php code in tags?

      I didn’t offer a download because the source is all there…as it is with every post.

    • peter

      Yes the copy button did not work for me, but I just selected copied and pasted the code.

      So the code is there but then you have to add the PHP and script tags, the link to the mootools library, the form to submit the URL… we are talking about recreating the demo, or there is no point in having a demo if it can’t be recreated..

      and I can’t recreate it. Sorry if I am not a good enough coder, that’s why I’m reading your tutorials…

    • If you used the copy method from the code all the HTML gets encoded. So you need to manually copy or re-encode the HTML.

      i.e. &lt; should be <

    • Using Firefox 4.0, the “copy” button does not even work.

  3. This is awesome, putting it to use.

  4. Ondřej Švec

    Pretty nice class! I am just curious: Why do you initialize the instance at all? Wouldn’t be better to just set all functions to static and use it the way without initialization of the class?

  5. joe

    This is pretty cool (as is the rest of your site) and I appreciate you sharing the code but I’m not quite sure I see the point. There are plenty of tools already out there for checking “toolbar” PageRank (not to be confused with *actual* PageRank). I use a firefox extension called Quirk SearchStatus that automatically shows the PR (and other metrics) of whichever page I’m on. seoforfirefox is another great plugin I use that can do a lookup of an individual page as well as list metrics of sites in SERPs.

    As for not being able to argue the importance of PR in driving traffic or having any real bearing on SEO whatsoever, I’d have to disagree with you. PR hasn’t been a meaningful factor for quite some time now and it’s not uncommon to see PR 0 sites whipping the pants off sites with much higher PR scores in the rankings.

  6. Victor Bolshov

    just a few notes on PHP code.

    1. $c = __CLASS__;
    self::$instance = new $c;

    you could write “self::$instance = new self();” instead.

    2. stringToNumber() is quite useless on a 64-bit machine

    3. function check($domain) could make use of PHP’s file_get_contents() and stream context (see http://php.net/file_get_contents, Example #4 Using stream contexts). Cheers!

    • Thank you for the improvement ideas. Could you explain what you mean by the second item, for the sake of others?

      Thank you!

    • Victor Bolshov

      In fact, it seems the code might not work properly even on 32bit system. 2147483647 is the max int in 32bit PHP (we don’t have unsigned ints here ;) ). On a 64bit system the limit is high enough not to take care in most cases.

  7. Very cool. I’m in shocked disbelief with the result it spits back out after entering my domain. The ego took a beating. Guess I gotta build a bridge…

  8. Hi David, you mention “this post will focus on how you can retrieve a domain’s Google PageRank” but please don’t forget that PageRank is assigned to a page, not a domain. Other than that, great post, thanks!

    ~ @kovshenin

  9. Awsome Article. I will use this in my code library. Thanks Mr.Walsh :)

  10. Wow awsome Tuts will use for future projects.. :)

  11. Thanks for share. I use code from Luc De Brouwer ( http://www.lucdebrouwer.nl/using-php-to-retrieve-the-google-pagerank-of-any-domain/ ) adapted by me to CodeIgniter Library…

    It’s effective solution but sometimes it’s not working – Google respond with error page. So if you want to use this code in production, remember about occasional limitations.

  12. I’m getting all the websites with PR5… Is it a bug or something?

  13. SirChunk

    Hey David,
    On your demo, how do you get around it if too many requests are made as surely G would ban the server IP?

  14. this code work for me
    on my local host with PHP 5.2

    thanks a lot david

  15. What happens if the Google Toolbar host does not respond?

  16. Maciej Mikulski

    Few days ago Google changed query phrase from “search?” to “tbr?”
    so the script stopped working.
    To bring it back you should change mentioned phrase from:
    $out = "GET /search?client=navclient-auto&ch=".$this->checkHash
    to:
    $out = "GET /tbr?client=navclient-auto&ch=".$this->checkHash

    @David: thanks for the script.

  17. Jürgen

    Thx to Maciej Mikulski for the information. Saved me a lot of time.

  18. Hi Guys,

    You’ll find an update version of the code here: https://github.com/phurix/pagerank

    Hope this helps!

  19. Hello.. I have tried with the code above. But i could not get the pagerank. Can you please help to get the pagerank. I need it . I have seen in your demo also but could not get it. Please help me urgent.

    Thanks,
    Thiru :)

  20. Tried this script but not working, could you pl. update the script and share code base.

  21. Maciej, you are a champion! I tried 5 different pagerank php classes and none of them worked, your find regarding the Google query phrase fixed them.

  22. Marco

    Let me just give you a link to the original script:

    http://labs.phurix.net/articles/pagerank

    I used this script in 2007, thats why i recognized. What exactly did you do to it to make it yours?

  23. Does this work on Android?

  24. if you find an error like “Strict Standards: Non-static method GooglePageRankChecker::getRank() should not be called statically in D:\xampp\htdocs\test.php on line 3”,

    replace this:
    function getRank($page)
    with:
    public static function getRank($page)

  25. bobo

    Thanks wps for your tips about the error “Strict Standards: Non-static method GooglePageRankChecker::getRank() should not be called statically in D:\xampp\htdocs\test.php on line 3”,

    now this dummy has it working too,

    But, this script shows only a number, our site has a 2, not the best i think? ;-)

  26. Hmmm.. I think this is not working anymore. Weird…

  27. It isnt working any more. :(
    demo also dead.

  28. Yeah, seems the script is no longer functional. Google must have changed something..

  29. Assam Silk

    I tried but its not working , i thinks its no longer functioning. Anyways thanks for the post

  30. Nice script example but Google have changed their structure so this script no longer functions. Thanks David but it would be worth putting a note at the top of this post for other users.

  31. Ive been looking for a script like this but sadly it is no longer working :( I hope you can update the script for it to work again. Thank you

  32. Made some changes in code above and now it works perfectly (especially for batch pagerank discoveries).

    $socket = fsockopen("toolbarqueries.google.com", 80, $errno, $errstr, 30);
    

    now looks like

    $socket = fsockopen("toolbarqueries.google.com", 80);
    

    and

    while(!feof($socket)) {
    $data = fgets($socket, 128);
    

    changed to

    while((!feof($socket)) AND (socket_get_status($socket)['timed_out'] != 1)) {
    $data = @fgets($socket);
    

    PHP 5.4.4 rules

  33. thankskkl

    Thanks for the update KKL. Gonna test out soon. Will report back if I experience any problems

  34. Hi KKL,

    that didn’t seem to work for me either, after a trawl I ended up at Github – https://github.com/eyecatchup/SEOstats/#brief-example-of-use

    This script does the trick – their demo isn’t working so example here – http://www.icalculator.info/website/google_pagerank_calculator.html

    This works perfectly for me, I have wrapped at the backend of a couple of sites to monitor specific landing pages etc, it wraps nicely and is reasonably fast when applied in an array.

    Its a working Solution until David amazes us all again :)

  35. Hi, this code worked fine for me until my ISP upgraded to PHP 5.4.4.

    Is there any fix to get it working again?

    It still works on PHP 5.3.3

  36. webnull

    What’s the code license? Can I add it to a LGPLv3 licensed project including @author tags and links to this post?

  37. As for SEOstats: As answered here http://stackoverflow.com/a/23509705/624466, SEOstats internally uses a standalone Google PageRank class that can be used as follows:

      $url = 'http://somedomain.com/';  
      $pr  = new GTB_PageRank($url);
    
      $rank = $pr->getPageRank();
    
      printf("The Google Pagerank of %s is %s.", $url, $rank);
    

    The nice thing about this class, as I think, is that it supports all existing PageRank hashing algorithms (awesome, jenkins, jenkins2 and IE) and has some advanced features built in, such as suggested Toolbar-TLD and more.

    You can get the code here: https://github.com/eyecatchup/SEOstats/blob/master/SEOstats/Services/3rdparty/GTB_PageRank.php
