Download a URL’s Content Using PHP cURL
Written by David Walsh on Tuesday, December 11, 2007
Downloading content at a specific URL is common practice on the internet, especially due to increased usage of web services and APIs offered by Amazon, Alexa, Digg, etc. PHP’s cURL library, which often comes with default shared hosting configurations, allows web developers to complete this task.
The Code
/* gets the data from a URL */
function get_data($url)
{
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
The Usage
$returned_content = get_data('http://davidwalsh.name');
Alternatively, you can use the file_get_contents function remotely, but many hosts don’t allow this because it requires the allow_url_fopen setting to be enabled.
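As a rough sketch of that alternative, the function below tries file_get_contents first and falls back to the cURL approach when allow_url_fopen is disabled. The name get_data_with_fallback and the default 5-second timeout are assumptions made for illustration; they are not part of the original post.

```php
<?php
/* get_data_with_fallback is a hypothetical name used for this sketch.
   It fetches a URL with file_get_contents when the host allows it, else with cURL. */
function get_data_with_fallback($url, $timeout = 5)
{
    // allow_url_fopen must be enabled for file_get_contents to open remote URLs.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $context = stream_context_create(array(
            'http' => array('timeout' => $timeout),
        ));
        // @ suppresses the warning; failure is reported through the false return value.
        return @file_get_contents($url, false, $context);
    }

    // Fall back to the cURL approach from the article.
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
```

Either branch returns the page content as a string, or false when the request fails, so the caller can treat both the same way.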
Epic Discussion
Alternatively you can use the PHP DOM:
$keywords = array();
$domain = array('http://davidwalsh.name');
$doc = new DOMDocument;
$doc->preserveWhiteSpace = FALSE;
foreach ($domain as $key => $value) {
    @$doc->loadHTMLFile($value);
    $anchor_tags = $doc->getElementsByTagName('a');
    foreach ($anchor_tags as $tag) {
        $keywords[] = strtolower($tag->nodeValue);
    }
}
Keep in mind this is not a tested piece of code. I took parts from a working script I created and cut out several of the checks I’d put in to remove whitespace, duplicates, and more.
For your script we can also add a User Agent:
$userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
Some other options I use:
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
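Folded into the article's function, the user agent and the options listed above might look like the following sketch. The name get_data_with_options is hypothetical, and the inline comments reflect how these cURL options are generally documented rather than anything stated in the thread.

```php
<?php
/* get_data_with_options is a hypothetical name; it combines the article's
   get_data() with the user agent and options suggested in this comment. */
function get_data_with_options($url)
{
    $ch = curl_init();
    $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location: redirects
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);    // set Referer automatically on redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // cap the whole transfer at 10 seconds
    $data = curl_exec($ch);
    curl_close($ch);
    return $data; // string on success, false on failure
}
```

Note that CURLOPT_CONNECTTIMEOUT only limits the connection phase, while CURLOPT_TIMEOUT limits the entire request, so setting both is not redundant.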
Excellent additions Shawn — thank you for posting them!
And with this great power, comes great responsibility =)
Very true Chris. It’s up to the developer to use it for good or evil. I suppose I’ve used it for both in the past.
For downloading remote XML or text files, this script has been golden.
Great script! Does anyone know how to save the content it gathers to a file locally on the server?
@KP: Check out my other article, Basic PHP File Handling — Create, Open, Read, Write, Append, Close, and Delete, here:
http://davidwalsh.name/basic-php-file-handling-create-open-read-write-append-close-delete
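For the specific case of saving the fetched content, a minimal sketch using PHP's built-in file_put_contents could look like this. The helper name save_content is hypothetical, and any path passed to it is up to you.

```php
<?php
/* save_content is a hypothetical helper: writes fetched content to a local file.
   Returns the number of bytes written, or false on failure. */
function save_content($content, $path)
{
    if ($content === false) {
        return false; // the download failed, so there is nothing to save
    }
    // LOCK_EX prevents two requests from writing the file at the same time.
    return file_put_contents($path, $content, LOCK_EX);
}
```

Usage would be along the lines of save_content(get_data('http://davidwalsh.name'), 'page.html'); where page.html is a placeholder file name. The web server user needs write permission on the target directory.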
I am trying to use this function, get_data($url), but it gives a blank page when I echo the result. Can anybody please help me?
@Usman: There are a few reasons why you may get a blank page. You may not have cURL installed on the server. The other possibility is that you need to “echo” the content before you close the connection; someone brought this issue to me the other day.
Hello David,
I am still unable to get a result from it. I have checked (using phpinfo()) that cURL is installed, but it still gives a blank page. When I tried it from the PHP command line, it worked.
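One way to narrow down a blank result is to ask cURL why the request failed. The sketch below is a hypothetical variant of get_data() (the name get_data_debug and the returned array layout are assumptions) that reports the cURL error message and HTTP status alongside the data.

```php
<?php
/* get_data_debug is a hypothetical variant of get_data() that reports why a
   request failed instead of silently returning nothing. */
function get_data_debug($url)
{
    if (!function_exists('curl_init')) {
        return array('data' => false, 'error' => 'cURL extension is not loaded', 'status' => 0);
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $data = curl_exec($ch);
    // curl_error() returns an empty string when no error occurred.
    $error = ($data === false) ? curl_error($ch) : '';
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return array('data' => $data, 'error' => $error, 'status' => $status);
}
```

Echoing $result['error'] when $result['data'] is false shows what went wrong. If the script works from the command line but not through the web server, it may also be worth checking whether the two load different php.ini files, since each can have its own.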
Works like a charm!
Works just like…. file_get_contents! Thanks.
The code is very effective, but the problem is that it returns all the HTML tags too. Is there any way to get rid of them?
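A possible approach, sketched here with PHP's built-in strip_tags, is to post-process the string that get_data() returns. The helper name html_to_text is hypothetical, and the script/style handling is an extra step that the question did not ask for but that usually matters in practice.

```php
<?php
/* html_to_text is a hypothetical helper that reduces fetched HTML to plain text. */
function html_to_text($html)
{
    // Drop <script> and <style> blocks first; strip_tags() would keep their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    $text = strip_tags($html);
    // Decode entities such as &amp; and collapse runs of whitespace.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $text));
}
```

For anything beyond simple text extraction, parsing the page with DOMDocument tends to be more reliable than regular expressions.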
This code is way too short; even php.net probably has a longer version! Beware: if you use this to let other users make the URL requests, they can easily use it to upload malicious code, whole new pages, or huge files like MP3s or movies that will eat up all your bandwidth.
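If the URL does come from user input, one hedged way to reduce that risk is to check it before calling get_data(). The function name is_allowed_url and the idea of a host whitelist are assumptions for this sketch, not something the article prescribes.

```php
<?php
/* is_allowed_url is a hypothetical check to run before passing user input to
   get_data(). The host whitelist is an assumption for illustration. */
function is_allowed_url($url, array $allowedHosts)
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    // Only plain web requests; this rejects file://, ftp://, gopher:// and so on.
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }
    return in_array(strtolower($parts['host']), $allowedHosts, true);
}
```

On the cURL side, CURLOPT_PROTOCOLS (with CURLPROTO_HTTP | CURLPROTO_HTTPS) and CURLOPT_MAXFILESIZE can add a second line of defence, though the size limit only helps when the server reports the file size up front.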
Do you know of a way to have it click a link on a page? I’m trying to work with another company’s registration form. It’s a stupid ASP page. On the first page it puts ?DoctorId=13074 at the end of the URL. On the next page, the one with the registration form, it dynamically generates a random string in a hidden input box that gets posted with the form. So is there any way I can have it click a link once it loads a page?
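cURL cannot click anything, but it can repeat the requests that a click would trigger. A rough sketch of that idea: fetch the form page, pull out the hidden inputs, then POST them back together with your own field values. Both function names below are hypothetical, and the cookie file is an assumption in case the site ties the hidden string to a session.

```php
<?php
/* extract_hidden_fields is a hypothetical helper: returns name => value pairs
   for every hidden input found in an HTML string. */
function extract_hidden_fields($html)
{
    $fields = array();
    if (!is_string($html) || $html === '') {
        return $fields; // nothing to parse
    }
    $doc = new DOMDocument;
    @$doc->loadHTML($html); // @ silences warnings about sloppy markup
    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden'
            && $input->getAttribute('name') !== '') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/* post_form is a hypothetical helper: submits $fields to $url as a form POST.
   The cookie file keeps the session between requests. */
function post_form($url, array $fields, $cookieFile)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // save cookies here
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send saved cookies
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
```

The request that loads the form page would need the same two cookie options for the session to carry over, which the article's get_data() does not set.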
Hi,
I’m using cURL to get details from an API call. One of my API calls returns a zip file.
I’d like to force a download of that zip file. How can I do this with cURL?
David Walsh’s code does not output anything for me.
Why?
I did include PHP tags before and after both code snippets.
@Joel: Because you have to add:
echo $returned_content;
after the last line ($returned_content = get_data('http://davidwalsh.name');)
I want to click a link after getting the contents of a web page.
How do I do it?
I would like to remove the XML declaration from the content returned from the URL.
I am appending the gathered data to an existing PHP/XML file and do not want the declaration in it.
Is there a simple solution?
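One simple option, sketched below, is a regular expression that removes a declaration from the start of the string before appending it. The helper name strip_xml_declaration is hypothetical.

```php
<?php
/* strip_xml_declaration is a hypothetical helper: removes an XML declaration
   from the start of a string and leaves everything else untouched. */
function strip_xml_declaration($xml)
{
    // Match optional leading whitespace, the declaration, and trailing whitespace, once.
    return preg_replace('/^\s*<\?xml\b.*?\?>\s*/s', '', $xml, 1);
}
```

Strings without a leading declaration pass through unchanged, so it is safe to call on every response.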
Hi!
I’m trying to parse AJAX content with PHP. The problem is:
when I read the URL source, the AJAX part isn’t visible, and I only grab the HTML from the rest of the site.
How can I solve this problem? Any ideas?
Is there a way to use cURL in PHP like you can on the command line? For example:
curl http://mydomain.com/picture.jpg -o "PATH_TO_SAVE_TO"
This would download a picture from a website and put it in a folder on my server. It works from Terminal, but I cannot find the equivalent in PHP.
If anyone knows the answer to this, I would greatly appreciate it.
@Kelly: Yes.
Place something like this in your PHP. There are three options:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.whateveryouwant.com.php.html.xml");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml_language = curl_exec($ch);
curl_close($ch);
// Output whatever came back: XML, PHP output, HTML, and so on.
echo $xml_language;
You have options using cURL:
Return the data as a string, using a database-driven URL. This returns the data and appends it to your PHP, HTML, XML, etc. Very handy, especially for Flash and others; see the worldwideweather.com forum.
This trick allows Flash to read an external XML file for its language and database info. Using PHP to look up the user-specific info, you can write the Flash XML on the fly. This script returns the user’s language interface for Flash: the user’s language is Spanish (ES), PHP appends that code to the XML URL it calls, and the file is read and written into the PHP script itself. Very, very fast.
// $line is assumed to be the user's database row, fetched as in the loop further down.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.verdegia.com/Files/System/TEST/Language/M_TEXT_" . $line["Language"] . ".xml");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml_language = curl_exec($ch);
curl_close($ch);
echo $xml_language;
Return the data in an external XML file from a user-specific PHP database call. This gets data specific to the user and generates the file on the fly: XML, PHP, HTML, whatever:
// $myusername must be escaped before it is used in the query.
$sql = "SELECT * FROM $tbl_name WHERE username='$myusername'";
// Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO on current versions.
$results = mysql_query($sql);
while ($line = mysql_fetch_assoc($results)) {
    $file = "http://www.worldweatheronline.com/feed/weather.ashx?q=" . $line["Postcode"] . "&format=xml&num_of_days=5&key=6c7e92e827155910100801";
}
// Write the response straight to a local file instead of returning it.
$ch = curl_init($file);
$fp = @fopen("../Files/System/TEST/temp.xml", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
// Reopen the saved file for reading.
$file = "../Files/System/TEST/temp.xml";
$fp = fopen($file, "r");
?>
HOPE THIS HELPS
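The CURLOPT_FILE approach in the last snippet is the closest match to curl -o on the command line. Wrapped as a reusable function, it might look like the sketch below; the name download_to_file, the timeout, and the cleanup of partial files on failure are assumptions added for illustration.

```php
<?php
/* download_to_file is a hypothetical helper, roughly `curl URL -o PATH`.
   Returns true on success; on failure it removes the partial file and returns false. */
function download_to_file($url, $path)
{
    $fp = @fopen($path, 'w');
    if ($fp === false) {
        return false; // could not open the destination for writing
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);         // write the body straight to the file
    curl_setopt($ch, CURLOPT_HEADER, 0);         // keep response headers out of the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true); // do not save HTTP error pages
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    if ($ok === false) {
        unlink($path);
        return false;
    }
    return true;
}
```

A call such as download_to_file('http://mydomain.com/picture.jpg', 'PATH_TO_SAVE_TO') would then mirror the command from the question. Because the body is streamed to disk, large files do not have to fit in memory.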