Yahoo SEO Domain Result Grabber

I released my PHP Google Grabber script about a month ago and it was a big hit, even spawning Python and Groovy versions. Obtaining the number of pages indexed in Google by simply providing a domain name (or several, if you loop the function) can save you a lot of time. I run this script monthly to keep track of my customers' websites; many of them use CMSs we've built, so I get to take a peek at how they're doing SEO-wise.

Although Yahoo! isn't nearly as relevant as Google in the search department, Yahoo! is still the most visited website on the internet. Since I already had the basic framework of the code built (from my Google Grabber), I thought it might be beneficial to take a few moments to Yahoo!ize it.

The Code

/* return result number */
function get_yahoo_results($domain = 'davidwalsh.name')
{
	// get the result content
	$content = file_get_contents('https://siteexplorer.search.yahoo.com/search?p=http%3A%2F%2F'.$domain.'&bwm=p&bwms=p&fr2=seo-rd-se');

	// parse to get results
	$pages = str_replace(array(' ',')','('),'',get_match('/Pages (.*) /isU',$content));
	$inlinks = str_replace(array(' ',')','('),'',get_match('/Inlinks (.*) /isU',$content));

	// default to 0 when nothing was matched
	$return = array();
	$return['pages'] = $pages ? $pages : 0;
	$return['inlinks'] = $inlinks ? $inlinks : 0;

	// return result
	return $return;
}

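/* helper: runs the regex and returns the first captured group ('' when there is no match) */
function get_match($regex,$content)
{
	// preg_match() leaves $matches empty when nothing matches, so guard the index
	preg_match($regex,$content,$matches);
	return isset($matches[1]) ? $matches[1] : '';
}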

The Usage

$domains = array('davidwalsh.name','digg.com','yahoo.com','cnn.com','dzone.com','some-domain-that-doesnt-exist.com');
foreach($domains as $domain)
{
	$result = get_yahoo_results($domain);
	echo $domain,': ',$result['pages'],' pages, ',$result['inlinks'],' inlinks',"\n";
}

//davidwalsh.name: 204 pages, 518 inlinks
//digg.com: 20,700,000 pages, 14,300,000 inlinks
//yahoo.com: 1,290,000,000 pages, 4,650,000 inlinks
//cnn.com: 7,510,000 pages, 1,090,000 inlinks
//dzone.com: 776,000 pages, 15,000 inlinks
//some-domain-that-doesnt-exist.com: 0 pages, 0 inlinks
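The counts come back as formatted strings ("20,700,000"), which is fine for display but awkward if you want to compare one month's numbers with the next. Here is a minimal sketch of one way to normalize them; count_to_int() is a hypothetical helper name of mine, not part of the original script.

```php
<?php
/* hypothetical helper: turn a formatted count such as "20,700,000" into an integer */
function count_to_int($count)
{
	// strip everything that is not a digit (commas, spaces, stray characters)
	$digits = preg_replace('/\D/', '', (string) $count);

	// an empty result means there was nothing numeric to read
	return $digits === '' ? 0 : (int) $digits;
}

var_dump(count_to_int('20,700,000')); // int(20700000)
```

With integers in hand, a month-over-month difference is a plain subtraction.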

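As with my Google Grabber, you may need to adjust the method of connecting to Yahoo! based on your hosting environment; some hosts, for example, turn off allow_url_fopen, which stops file_get_contents() from opening URLs. cURL may be the best option for you.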
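If you do go the cURL route, a rough sketch could look like the following. The function names yahoo_explorer_url() and fetch_with_curl() and the 10-second timeout are assumptions of mine for illustration; only the URL itself comes from the script above.

```php
<?php
/* hypothetical helper: build the same Site Explorer URL the script above requests */
function yahoo_explorer_url($domain)
{
	// urlencode() turns "http://" into "http%3A%2F%2F", as in the original URL
	return 'https://siteexplorer.search.yahoo.com/search?p='.urlencode('http://'.$domain).'&bwm=p&bwms=p&fr2=seo-rd-se';
}

/* hypothetical helper: fetch a URL with cURL; returns the body, or false on failure */
function fetch_with_curl($url, $timeout = 10)
{
	// bail out if the cURL extension isn't installed
	if (!function_exists('curl_init')) {
		return false;
	}

	$ch = curl_init($url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds allowed for connecting
	curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // seconds allowed for the whole request

	// curl_exec() returns the body as a string, or false when the request fails
	return curl_exec($ch);
}
```

To try it, swap the file_get_contents() line inside get_yahoo_results() for $content = fetch_with_curl(yahoo_explorer_url($domain)); the parsing code stays the same.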

Discussion

  1. kenny

    Hi, David,

    how do I use this please? do you have a example or something like that?

    Thanks for sharing.

  2. this code working fine, thanks
