Yahoo SEO Domain Result Grabber


I released my PHP Google Grabber script about a month ago and it was a big hit, even spawning Python and Groovy versions. Obtaining the number of pages indexed in Google by simply providing a domain name (or several, if you call the function in a loop) can save you a lot of time. I run this script monthly to keep track of my customers' websites; many of them use CMSs we've built, so I get to take a peek at how they're doing SEO-wise.

Although Yahoo! isn't nearly as relevant as Google in the search department, Yahoo! is still the most visited website on the internet. Since I already had the basic framework of the code built (from my Google Grabber), I thought it might be beneficial to take a few moments to Yahoo!ize it.

The Code

/* return result numbers */
function get_yahoo_results($domain = 'davidwalsh.name')
{
	// default to zero when the request fails or nothing matches
	$return = array('pages' => 0, 'inlinks' => 0);

	// get the result content; urlencode() escapes the "http://domain" value
	$content = file_get_contents('https://siteexplorer.search.yahoo.com/search?p='.urlencode('http://'.$domain).'&bwm=p&bwms=p&fr2=seo-rd-se');
	if ($content === false)
	{
		return $return;
	}

	// parse to get results, stripping spaces and parentheses
	$pages = str_replace(array(' ',')','('),'',get_match('/Pages (.*) /isU',$content));
	$inlinks = str_replace(array(' ',')','('),'',get_match('/Inlinks (.*) /isU',$content));

	$return['pages'] = $pages ? $pages : 0;
	$return['inlinks'] = $inlinks ? $inlinks : 0;

	// return result
	return $return;
}

/* helper: runs the regex and returns the first captured group, or '' when nothing matches */
function get_match($regex,$content)
{
	return preg_match($regex,$content,$matches) ? $matches[1] : '';
}
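
To see what the parsing step does in isolation, here is a minimal sketch run against a made-up snippet. The `$sample` string is only an assumption about the general shape of the text ("Pages", then a figure in parentheses, then a space); the real page markup may differ.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$sample = '<li>Pages (1,234) </li><li>Inlinks (56) </li>';

// With the U (ungreedy) modifier, (.*) stops at the first space that follows.
preg_match('/Pages (.*) /isU', $sample, $matches);
$raw = isset($matches[1]) ? $matches[1] : '';

// Strip spaces and parentheses, leaving only the figure.
$pages = str_replace(array(' ', ')', '('), '', $raw);

echo $pages, "\n";
```

The captured group still carries the parentheses, which is why the `str_replace()` call follows the match.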

The Usage

$domains = array('davidwalsh.name','digg.com','yahoo.com','cnn.com','dzone.com','some-domain-that-doesnt-exist.com');
foreach($domains as $domain)
{
	$result = get_yahoo_results($domain);
	// end each line with a newline so the results don't run together
	echo $domain,': ',$result['pages'],' pages, ',$result['inlinks'],' inlinks',"\n";
}

//davidwalsh.name: 204 pages, 518 inlinks
//digg.com: 20,700,000 pages, 14,300,000 inlinks
//yahoo.com: 1,290,000,000 pages, 4,650,000 inlinks
//cnn.com: 7,510,000 pages, 1,090,000 inlinks
//dzone.com: 776,000 pages, 15,000 inlinks
//some-domain-that-doesnt-exist.com: 0 pages, 0 inlinks
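
The counts come back as formatted strings such as 20,700,000, which is fine for display but awkward if you want to compare one month's numbers with the next. A minimal sketch of a conversion helper follows; `result_to_int()` is a hypothetical name and not part of the original script.

```php
<?php
/* hypothetical helper: turn a formatted count like "20,700,000" into an integer */
function result_to_int($value)
{
	// drop the thousands separators, then cast; non-numeric input becomes 0
	return (int) str_replace(',', '', (string) $value);
}

echo result_to_int('20,700,000'), "\n"; // 20700000
echo result_to_int(0), "\n";            // 0
```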

As with my Google Grabber, you may need to adjust the method of connecting to Yahoo! based on your hosting environment. If file_get_contents() can't open remote URLs on your host, cURL may be the best option for you.
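
If you go the cURL route, something along these lines could stand in for the `file_get_contents()` call. Treat it as a sketch: `fetch_with_curl()` is a hypothetical helper name, and the timeout and user agent values are assumptions you should tune for your own host.

```php
<?php
/* hypothetical helper: fetch a URL with cURL; returns '' on failure */
function fetch_with_curl($url, $timeout = 10)
{
	$ch = curl_init($url);
	curl_setopt_array($ch, array(
		CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
		CURLOPT_FOLLOWLOCATION => true,      // follow redirects
		CURLOPT_CONNECTTIMEOUT => $timeout,  // seconds to wait for the connection
		CURLOPT_TIMEOUT        => $timeout,  // seconds allowed for the whole request
		CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; result-grabber)', // assumed value
	));

	// curl_exec() returns false on failure; normalize that to an empty string
	$content = curl_exec($ch);
	return $content === false ? '' : $content;
}
```

Returning an empty string on failure means the parsing step simply finds no match and the counts fall back to 0.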

Discussion

  1. kenny

    Hi, David,

    How do I use this, please? Do you have an example or something like that?

    Thanks for sharing.

  2. This code is working fine, thanks.
