Download a URL’s Content Using PHP cURL

By David Walsh on December 11, 2007

Downloading content at a specific URL is common practice on the internet, especially due to increased usage of web services and APIs offered by Amazon, Alexa, Digg, etc. PHP's cURL library, which often comes with default shared hosting configurations, allows web developers to complete this task.

The Code

/* gets the data from a URL */
function get_data($url) {
	$ch = curl_init();
	$timeout = 5;
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

The Usage

$returned_content = get_data('https://davidwalsh.name');

Alternatively, you can use the file_get_contents function to fetch a remote URL, but it only works when PHP's allow_url_fopen setting is enabled, and many hosts turn it off.
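
A failed request makes curl_exec() return false, which get_data() passes on silently, so an echo of the result shows a blank page. The sketch below is one way to surface the reason. The name get_data_checked() and the shape of the returned array are assumptions for illustration, not part of the original function.

```php
<?php
/* Hypothetical variant of get_data() that reports why a request failed. */
function get_data_checked(string $url, int $timeout = 5): array {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    // curl_exec() returns false on failure; curl_error() describes the cause
    $error = ($data === false) ? curl_error($ch) : null;
    // The handle is released when $ch goes out of scope
    return [
        'ok'    => $data !== false,
        'data'  => ($data === false) ? null : $data,
        'error' => $error,
    ];
}
```

A caller would check $result['ok'] first, then either echo $result['data'] or log $result['error'].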

Discussion

Shawn

Alternatively you can use the PHP DOM:

$keywords = array();
$domain = array('http://davidwalsh.name');
$doc = new DOMDocument;
$doc->preserveWhiteSpace = FALSE;
foreach ($domain as $key => $value) {
    @$doc->loadHTMLFile($value);
    $anchor_tags = $doc->getElementsByTagName('a');
    foreach ($anchor_tags as $tag) {
        $keywords[] = strtolower($tag->nodeValue);
    }
}

Keep in mind this is not a tested piece of code. I took parts from a working script I created and cut out several of the checks I'd put in to remove whitespace, duplicates, and more.

karthik
Hi,

I am checking whether an external URL points to a valid or an invalid page. How can I get a success or failure result without using cURL?

Shawn

For your script we can also add a User Agent:

$userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);

Some other options I use:
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

Ghulam Rasool
Hi Shawn,
using user agent as option helped me to sort out my problem.

Thanks

david
Excellent additions Shawn — thank you for posting them!
Chris Coyier
And with this great power, comes great responsibility =)
david
Very true Chris. It’s up to the developer to use it for good or evil. I suppose I’ve used it for both in the past.

For downloading remote XML or text files, this script has been golden.
KP
Great script! Does anyone know how to use that script to save the content it gathered and save it to a file locally on the server?
david
@KP: Check out my other article, Basic PHP File Handling — Create, Open, Read, Write, Append, Close, and Delete, here:

http://davidwalsh.name/basic-php-file-handling-create-open-read-write-append-close-delete
Usman
I am trying to use this function “get_data($url)”, but it gives blank page when I echoed it. Anybody can please help me?
david
@Usman: There are a few reasons why you may get a blank page. You may not have cURL installed on the server. The other possibility is that the content is never output: get_data() only returns the string, so you need to “echo” it yourself. Someone brought this issue to me the other day.
Usman
Hello David,
I am still unable to get a result from it. I have checked (using phpinfo()) that cURL is installed, but it still gives a blank page. When I run it from the PHP command line, it works.

Jordah Ferguson
Hi, this script works for me but unfortunately fails on URLs from the same domain as the calling script. I can't see any error in error.log.

Dru
Works like a charm!
Indonesia
Works just like…. file_get_contents! Thanks.
Ajay
The code is very effective, but the problem is that it returns all the HTML tags along with the content. Is there any way to get rid of them?

phpBeginner
Use strip_tags($textRetrieved); This will return the string with no tags. I hope this helps.
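
As a rough illustration of the strip_tags() suggestion, the sketch below also decodes HTML entities and collapses whitespace. The helper name html_to_text() is made up for this example. Note that strip_tags() removes only the tags themselves, so the text inside script or style elements stays in the output.

```php
<?php
/* Hypothetical helper: reduce an HTML string to plain text. */
function html_to_text(string $html): string {
    // Remove the tags but keep the text between them
    $text = strip_tags($html);
    // Turn entities such as &amp; back into characters
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace left behind by the removed markup
    return trim(preg_replace('/\s+/', ' ', $text));
}
```

Passing it the string returned by get_data() gives the page text without markup.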

bit
this code is way too short, even php.net probably has a longer version! beware if you use this to enable other users to make the URL requests, they can easily use it to upload malicious code/whole new pages/huge files, like mp3s or movies, that will eat up all your bandwidth.
Kyle
Do you know of a way to have it click a link on a page? I'm trying to work with another company's registration form. It's a stupid ASP page. On the first page it puts ?DoctorId=13074 at the end of the URL. On the next page, with the registration form, it dynamically generates a random string in a hidden input box that gets posted with the form. So is there any way I can have it click a link once it loads a page?
Thomas Alexander
Hi,
I'm using cURL to get details from an API call. One of my API calls returns a zip file,
and I'd like to force a download of that zip file. How can I do this with cURL?
Joel Kiskola
David Walsh code does not give anything to me.
Why?

I did include php tags before and after both codes.
Rhys
@Joel – cause you have to add : echo $returned_content after the last line ($returned_content = get_data('http://davidwalsh.name');)
electrogeek
i want to onclick a link after getting contents of webpage
how to do it?
George
I would like to remove the xml declaration from the returned url.

I am appending the gathered data to an existing php/xml file and do not want it.

is there a simple solution??
FMdB
hi!

im trying to parse AJAX with PHP, problem is:

when i read the URL SOURCE, the AJAX part isn’t visible, and i only grab HTML from the rest of site.

how to solve this problem? any ideas?
Kelly
Is there a way to use curl in php like you can in the command line. aka

curl http://mydomain.com/picture.jpg -o “PATH_TO_SAVE_TO”

This would download a picture from a website and put it in a folder on my server. It works from Terminal but i cannot find the equivalent in PHP.

If anyone knows the answer to this I would greatly appreciate it.

george

@Kelly: yes,

place something like this in your php: 3 options,

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,
        "http://www.whateveryouwant.com.php.html.xml");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml_language = curl_exec($ch);
curl_close($ch);
echo $xml_language; // output the fetched content

You have options using cURL:
Return the data as a string, driven by the database. This returns the data and appends it to your PHP, HTML, XML, etc. output. VERY HANDY, especially for Flash and others; see the worldwideweather.com forum.

This trick allows Flash to read an external XML file for its language and database info. Using PHP to look up the user-specific info, you can write the Flash XML on the fly. This script returns the user's language interface for Flash: if the user's language is Spanish (ES), that code is appended to the XML URL in the PHP call, the file is read, and its contents are written into the PHP output itself. Very, very fast.

$ch = curl_init();
// Square brackets replace the old $line{"..."} syntax, which PHP 8 no longer accepts
curl_setopt($ch, CURLOPT_URL,
        "http://www.verdegia.com/Files/System/TEST/Language/M_TEXT_" . $line["Language"] . ".xml");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$xml_language = curl_exec($ch);
curl_close($ch);
echo $xml_language;

Write the data to an external XML file based on a user-specific database query. This gets data specific to the user and generates the file on the fly (XML, PHP, HTML, whatever):

	
// Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO on current versions.
$sql = "SELECT * FROM $tbl_name WHERE username='$myusername'";
$results = mysql_query($sql);
while ($line = mysql_fetch_assoc($results)) {
    $file = "http://www.worldweatheronline.com/feed/weather.ashx?q=" . $line["Postcode"] . "&format=xml&num_of_days=5&key=6c7e92e827155910100801";
}

// Stream the response straight into a local file
$ch = curl_init($file);
$fp = @fopen("../Files/System/TEST/temp.xml", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);

// Reopen the saved file for reading
$file = "../Files/System/TEST/temp.xml";
$fp = fopen($file, "r");

HOPE THIS HELPS
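
The CURLOPT_FILE approach in the last snippet is the closest PHP counterpart to curl -o on the command line. Below is a minimal sketch of it as a reusable function. The name download_to_file() and the default timeout are assumptions made for this example. Because the response is streamed to disk rather than held in a string, the same pattern should also suit large files.

```php
<?php
/* Hypothetical helper: save the content at $url to $path, like `curl URL -o PATH`. */
function download_to_file(string $url, string $path, int $timeout = 30): bool {
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body to the file handle
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // downloads are often redirected
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP 4xx/5xx as failures
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    $ok = curl_exec($ch) !== false;
    unset($ch);   // release the handle before closing the file
    fclose($fp);
    if (!$ok) {
        unlink($path); // do not leave a partial or empty file behind
    }
    return $ok;
}
```

For example, download_to_file('http://mydomain.com/picture.jpg', '/path/to/save/picture.jpg') returns true once the file has been written.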

Thiet ke web
Can we use this function to parse all content in a url?
Dennis Gearon
@Indonesia: Except there are a lot more options. I believe it's possible to get the whole HTTP response using cURL, and I believe that is not true with file_get_contents().
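
For reference, cURL can return the response headers together with the body when CURLOPT_HEADER is enabled, and CURLINFO_HEADER_SIZE reports where the headers end. The sketch below shows one way to split the two; both function names are invented for this example.

```php
<?php
/* Split a raw response into its header block and body at the given offset. */
function split_http_response(string $raw, int $headerSize): array {
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}

/* Hypothetical helper: fetch $url and return status, headers and body, or null on failure. */
function get_full_response(string $url): ?array {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true); // include the headers in the returned string
    $raw = curl_exec($ch);
    if ($raw === false) {
        return null;
    }
    $result = split_http_response($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
    $result['status'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return $result;
}
```

The status value is also a simple way to tell success from failure, for example by checking for a code in the 200 range.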

hamadi

$sql =  "UPDATE staff SET 
		staffNo = $staff_no,
		f_name=$fname, 
		l_name=$lname,
		sex=$sex,
		DOB=$dob,
		position=$position, 
		salary=$salary,
		hiredate=$hiredate,
		contact_id=$contact_id,
		branchNo=$branch_no
		WHERE staffNo=$staff_no";
$query = mysql_query($sql) or die("Cannot query the database." . mysql_error());
echo "Database Updated.";

Sajid Hussain
Great,

It is useful for getting XML or images from another site if the server is not able to get content with fopen.

Thanks
saviola
Nice,
PHP provides two other methods to fetch a URL: cURL and fsockopen.
To use them, you can check this example: http://www.bin-co.com/php/scripts/load/
buzzknow
how to write return into new file?
Pete
Hi,

Can you put the curl call in a loop, i have a list of about 1000 urls that i want to ‘hit’ so the caches build up, can i just chuck the above code into a loop or will that be too resource heavy?

Thanks

Pete
Popcorn
Thanks for the code..Great!
Myister
It is very possible to put this into a automatic crawler for user inputted sites or even make a automatic crawl out of this… The code is short but it works for only one page at a time.. To make it look at multiple pages you have to do some minor PHP coding but nothing major…

I am working on a script right now that works using the code above and just keeps crawling based on the links that on on the initial web page Crawled. A non stop Spider script! They are already out there but I like to say I can make one too…

The script will also take the Meta tags ( Description and Keywords and place them into a database too. Thus giving me a search engine and not a user submitted directory…

if you would like to join the team simply e-mail me at justin2009@gmail.com
Learnphp123
I want to extract the images present in the URL and first paragraph from the url. How can I do that?

ekomkaar.com
Hi there, I can't post code here, so I can provide you with my class, which extracts a particular tag from the returned page. It could be any HTML tag.

Regards,
M. Singh

Comrade
A simple question..how to accelerate the downloading process using cURL it is damn slow…takes sometimes 45sec to download 4kb page

Ivan Ivković
It depends upon the configuration of the host server.

ekomkaar.com
How do I work with HTTPS? I have a site which does not load when I try to open it over HTTPS; it simply returns a 405 error. Any help, please mail me at msingh@ekomkaar.com
Jordah Ferguson
ie it doesnt return any output.
Octav
Very strange. This function returns only relative small pages.
It works if the source code has under 200 lines.
If the web page is bigger won’t return anything. Not even errors.
Same thing happens with file_get_contents.

PHP Version 5.2.11
memory_limit – 128M
max_execution_time – 30
error_reporting(E_ALL)

Any idea?
iknowv
Hey David,

I searched the net for rough code with which I can get “Reciprocal” backlink status.

This finally helped me. :)

I modified it as per my need.

To check backlinks
0)
{
echo 'found';
}else{
echo 'Not found';
}
}

$remote_url = 'http://www.listsdir.com/';
$mmb_url = 'http://mymoviesbuzz.com/titles/';

$returned_content = get_data($remote_url,$mmb_url);
?>

Thanks.
andaru
Thanks, this script help me to move my wordpress content to new host.
harish
I think it is good practice to use CURLOPT_USERAGENT in cURL scripts…
Zac
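To read sites served over SSL, like Google Calendar feeds, when certificate verification fails, you can set these cURL options. Note that they turn off certificate verification, so the connection is no longer protected against man-in-the-middle attacks: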
```
curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
```
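
If the HTTPS failure comes from a missing or outdated CA bundle on the server, which is only an assumption about the cause, a safer route is to keep verification on and point cURL at a CA bundle. The sketch below builds such an option set; build_https_options() is a hypothetical name, and the bundle path is whatever file your host provides.

```php
<?php
/* Hypothetical helper: cURL options for HTTPS with certificate checks left on. */
function build_https_options(string $url, ?string $caBundle = null): array {
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true, // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // verify the certificate matches the host name
    ];
    if ($caBundle !== null) {
        // Path to a PEM file of trusted certificate authorities
        $options[CURLOPT_CAINFO] = $caBundle;
    }
    return $options;
}
```

The array can be applied in one call with curl_setopt_array($ch, build_https_options($url, $pathToBundle)) before curl_exec().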
dinh vi
Hello David,
How can I download a file from a remote URL? I've tried using your method but had no luck :(
ahmad
How can I log in with cURL?
John
How do I repeat the process?
Thunderbird
How can I download the contents of a website that requires login??

Manjit Singh
You need to set up a session for that and pass the session cookies with the request headers, so the site treats it as a normal login. For further details you can contact me at msingh@ekomkaar.com
Peter
Hi,
I’m trying to download contents of a website that requires login, but my script is not working.
Could you help ?
Thanks.

Vinoth Kumar
I'm running a web hosting website. My domain provider gave me some HTTP APIs. I tried to implement them, but I'm getting an empty response from cURL. It's an HTTPS URL and I used
```
curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
```
params in my cURL call, but I'm still getting an empty response. Can anyone help me with this? I'm new to cURL :'(
Giu87
It is possible to retrieve the code inserted into html tag (i.e. flash)?

More precisely, @ the linkedin page of a skill:

http://www.linkedin.com/skills/skill/Java?trk=skills-pg-search,

there is a graphic obtained by an tag, which returns an image.

When I take source code of the page or when I use file_get_contents() php function, I can obtain only the returned tag.

I can see on the Firefox analysis of the page all these information, but I want an automatic script.

Any solution?
Mika Andrianarijaona
Thank you, the code is working fine for me. You are saving me ;)
andrei
thank you! i looked for this code quite some time.
David
Thank you!

I wasn't fully getting it, and just wanted a script I could copy and paste, make sure things were working, then modify from there. This was the only one I could find that actually returned the content. Thank you!

I’ve been here on a few occasions and appreciate every aspect of the user friendly design! And every article is quality. Thanks!
(Wow, being positive in a somewhat general way like that kind of resembles the ever infamous spam comments. Sorry :-/ )
Profesor Yeow
Thanks a lot for the script!
wilson
I am trying to run cURL on localhost and I have changed php.ini. There are no errors, only a blank page. Are there any other settings in php.ini or Apache that I need?
Tom
When I use the PHP curl function, it always wants to first return (as in echo) the contents of the URL when I only want it assigned to a variable. Is there a way to stop this from happening? I used your code exactly and simply called it from the main program. The behavior is the same if I call the php program from the command line or from via a browser.
Tom
Please ignore my previous post. For some unknown reason, I was overlooking a simple echo statement in the midst of my sloppy code. Duh…

It works just fine!
Mim
Thanks . It works. i have one question. is it possible to filter the result. i mean, i want to publish some contents and some not.
Thank You
dennis
The only things that may be missing are potential redirects, potential sessions, and maybe a few other things (the browser user agent, as mentioned). E.g. if you download a file, you will often be redirected or you will need to use sessions. The solution to this will be something like this:
```
curl_setopt($s,CURLOPT_FOLLOWLOCATION,1);
curl_setopt($s,CURLOPT_COOKIEJAR, '/tmp/cookie.txt');
curl_setopt($s,CURLOPT_COOKIEFILE,'/tmp/cookie.txt');
```
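
Those three options are also the core of fetching a page behind a login: post the form once with a cookie jar, then request the protected page with the same jar. The following is only a sketch. The function names are invented, and the form field names in the usage note are hypothetical, since every site names them differently.

```php
<?php
/* Build options that follow redirects and share cookies through $cookieFile. */
function session_options(string $url, string $cookieFile, ?array $postFields = null): array {
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies are saved
        CURLOPT_COOKIEFILE     => $cookieFile, // where cookies are read from
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

/* Run one request; returns the body as a string, or false on failure. */
function session_request(string $url, string $cookieFile, ?array $postFields = null) {
    $ch = curl_init();
    curl_setopt_array($ch, session_options($url, $cookieFile, $postFields));
    // Cookies are written to the jar when the handle is released at the end of this function
    return curl_exec($ch);
}
```

A typical sequence would be session_request($loginUrl, $jar, ['username' => $user, 'password' => $pass]) followed by session_request($protectedUrl, $jar). Sites that add a hidden token to the form need that value read from the login page and included in the posted fields.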
truyen nguoi lon
Thanks . It works.

Vinay Pandya

$url = "https://api.dailymotion.com/video/xz9frh?fields=price_details";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Certificate checks are disabled here, so the connection is not authenticated
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$data = curl_exec($ch);
curl_close($ch);
// No "return" here: at the top level of a script it would stop execution before the echo
$returned_content = $data;
echo $returned_content;
// Five characters of the response, starting at offset 2
$error = substr($returned_content, 2, 5);
echo $error;

aljie
I have a question. Sorry, but the code above doesn't work for me because I'm not familiar with PHP cURL.
I have a form with an image in it; basically it's a certificate. The user has two choices, either print or download. How can I offer the download with an image/jpeg content type?
Breen
Nice. This is the preferred way to get HTML.

file_get_contents for URLs is getting close to being a train wreck. It can be turned off or on unpredictably by hosts, and it seems incompatible with many modern Linux distributions out of the box. file_get_contents seems to use its own rules for name resolution and often times out or is extremely slow. There seems to be no consistent fix for this.

Don’t use file_get_contents. Use cURL. Combined with the Simple DOM Parser, it is powerful stuff.
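
As an illustration of pairing fetched HTML with a DOM parser, the sketch below pulls the image URLs and the first paragraph out of an HTML string. It uses PHP's built-in DOMDocument rather than the third-party Simple DOM Parser mentioned above, and the function name is made up for this example.

```php
<?php
/* Hypothetical helper: list <img> sources and the first <p> text in an HTML string. */
function extract_images_and_first_paragraph(string $html): array {
    $result = ['images' => [], 'first_paragraph' => null];
    if (trim($html) === '') {
        return $result; // nothing to parse
    }
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely valid
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $result['images'][] = $src;
        }
    }
    $firstParagraph = $doc->getElementsByTagName('p')->item(0);
    if ($firstParagraph !== null) {
        $result['first_paragraph'] = trim($firstParagraph->textContent);
    }
    return $result;
}
```

The input would normally be the string returned by get_data(). Relative image paths still need to be resolved against the page URL.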
shafiul
Thanks for the code. It works well when I try to store the contents of a page from the intranet or a local server, but it does not work when I try to load a page from the internet, say http://www.google.com or any other site. If there is any solution to this problem, please mail me.
Imran
This curl code is extracting page as whole. Am i able to extract some part from inside page .. Ex: i want to extract a portion in between ? what else code i will use to extract?

Ano

I implement

$datanya = get_data('http://data.alexa.com/data?cli=10&dat=snbamz&url=pasarkode.com');

print_r($datanya);

function get_data($url) {
	$ch = curl_init();
	$timeout = 5;
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

Doesn’t work, what wrong with my code?

Rahul
@Ano there is problem in your domain : http://data.alexa.com/data?cli=10&dat=snbamz&url=pasarkode.com

If you go and see in your source code you can see the function is working fine but you are actually getting xml code so you must call those codes into xml instead of html.

Laurent
Thanks a lot @Vinay Pandya !
I was trying to figure out why I could not download files from HTTPs urls.
I was like crazy as nuts because “curl_setopt($ch, CURLOPT_SSLVERSION,3);” didn’t work but your code is good.
Kemal
Thank you!
John
I am trying to add a piece of code which gets a URL and displays the content of that page in an article, using this block of code from the web. I am getting nothing; it will not do anything. I have activated the PHP plugin. I am on Joomla version 3.4.6. The page I am trying to show the code on is here: http://www.chefjamie.com/2015/index.php/features-2/layouts
```
/* gets the data from a URL */
function get_data($url) {
	$ch = curl_init();
	$timeout = 5;
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

$returned_content = get_data('http://melissas.com');
```
Alex
Hi all,

I run the code and get a blank page. When I add an echo $returned_content, I don’t get the source code but the page itself.
If I execute curl -s 'http://download.finance.yahoo.com' on command line I get the source code.

Can anyone help me?
Thanks
Alex
achi
Can you please help me find a solution for my problem with cURL?
I wrote a script that uses cURL to get information about streaming links. It works for links that are hosted on the streaming website itself, where I was able to get the information from the servers. I also use the wget command when I want to download the link.
My problem is: how can I use cURL or wget to get a response showing that a link exists (the link works in VLC or in Kodi) and is valid on the server, for links like the one below? (I got the links from Kodi.)
I mean that I want to use cURL or wget with Kodi links to get information from the server.
The purpose of the request is to prove that the link exists. With the curl command I get a 403 Forbidden response, while the link works in Kodi. Here is my script and an example link:
URL –>http://dittotv.live-s.cdn.bitgravity.com/cdn-live/_definst_/dittotv/secure/zee_cinema_hd_Web.smil/playlist.m3u8
I also tried: wget -a --spider myurl –> it returns exit code 8.

Thank you for your time, Sir

The Script that i use :
```
#!/bin/bash
# Interactively check a streaming URL, optionally show the cURL details, and download it.
ans2=Y
while [ "$ans2" = "Y" ]; do
  read -r -p "URL to check: " url
  # --fail makes curl exit non-zero on HTTP errors such as 403 or 404
  if curl -v -i --output /dev/null --silent --fail "$url"; then
    printf '%s --> The link exists!\n' "$url"
  else
    printf '%s --> The link does not exist!\n' "$url"
  fi

  printf 'Do you want to show the cURL information for the streaming link? (Y/N/Q): '
  read -r ans
  if [ "$ans" = "Q" ]; then
    exit
  fi
  if [ "$ans" = "Y" ]; then
    curl -v -i "$url"
  else
    printf 'OK, no problem. Next question:\n'
  fi

  printf 'Do you want to download the streaming video from the server? (Y/N/Q): '
  read -r ans3
  if [ "$ans3" = "Q" ]; then
    exit
  fi
  if [ "$ans3" = "Y" ]; then
    # Only download when a HEAD request succeeds
    if curl --output /dev/null --silent --head --fail "$url"; then
      wget "$url"
    else
      printf 'The link is down! No file to download.\n'
    fi
    exit
  elif [ "$ans3" = "N" ]; then
    printf 'OK, no problem. Next question:\n'
  fi

  printf 'Do you want to check another URL? (Y/N): '
  read -r ans2
  if [ "$ans2" = "N" ]; then
    printf 'Goodbye, thank you!\n'
  fi
done
 
```
Shafiq
Is it possible to download the large file to server for example 500MB or 1 GB file through this process.

Is there any way to do it by jquery & ajax to make it more userfriendly.

Thanks for your answer in advance.
