Download a URL’s Content Using PHP cURL
Downloading the content at a given URL is a common task on the web, especially with the growing use of web services and APIs from Amazon, Alexa, Digg, and others. PHP's cURL extension, which is often enabled in default shared hosting configurations, lets web developers complete this task.
The Code
/* gets the data from a URL */
function get_data($url) {
	$ch = curl_init();
	$timeout = 5;
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}
The Usage
$returned_content = get_data('https://davidwalsh.name');
Alternatively, you can use the file_get_contents function remotely, but many hosts don't allow this.
Alternatively, you can use the PHP DOM:
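The original snippet was not preserved with this comment; a minimal sketch using PHP's built-in DOMDocument (which fetches the URL itself only when allow_url_fopen is enabled) might look like this:

// Load a remote page into the DOM and list its links.
$doc = new DOMDocument();
libxml_use_internal_errors(true); // silence warnings from real-world HTML
$doc->loadHTMLFile('https://davidwalsh.name');
libxml_clear_errors();
foreach ($doc->getElementsByTagName('a') as $link) {
	echo $link->getAttribute('href') . "\n";
}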
Keep in mind this is not a tested piece of code. I took parts from a working script I created and cut out several of the checks I'd put in to remove whitespace, duplicates, and more.
Hi,
I am using an external URL to check whether a page is valid or invalid. How can I get success or failure without using cURL?
For your script we can also add a User Agent:
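Something along these lines, added inside get_data() before curl_exec() (the original snippet was not preserved, and the UA string here is only an example):

curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MyFetcher/1.0)');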
Hi Shawn,
Using the user agent option helped me sort out my problem.
Thanks
Excellent additions Shawn — thank you for posting them!
And with this great power comes great responsibility =)
Very true Chris. It’s up to the developer to use it for good or evil. I suppose I’ve used it for both in the past.
For downloading remote XML or text files, this script has been golden.
Great script! Does anyone know how to use it to save the gathered content to a file locally on the server?
@KP: Check out my other article, Basic PHP File Handling — Create, Open, Read, Write, Append, Close, and Delete, here:
http://davidwalsh.name/basic-php-file-handling-create-open-read-write-append-close-delete
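As a quick illustration of the idea, get_data() pairs naturally with file_put_contents() (the file path here is just an example):

$content = get_data('https://davidwalsh.name');
// Write the fetched markup to a local file on the server.
file_put_contents('/tmp/davidwalsh.html', $content);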
I am trying to use this function, get_data($url), but it gives a blank page when I echo the result. Can anybody please help me?
@Usman: There are a few reasons why you may get a blank page. You may not have cURL installed on the server. The other possibility is that you need to echo the content before you close the connection; someone brought this issue to me the other day.
Hello David,
I am still unable to get a result from it. I have checked (using phpinfo()) that cURL is installed, but it gives a blank page. When I try it from the PHP command line, it works.
Hi, this script works for me but unfortunately fails on URLs from the same domain as the calling script. I can't see any error in error.log.
Works like a charm!
Works just like…. file_get_contents! Thanks.
The code is very effective, but the problem is that it returns all the HTML tags. Is there any way to get rid of them?
Use
strip_tags($textRetrieved);
This will return the string with no tags. I hope this helps.

This code is way too short; even php.net probably has a longer version! Beware if you use this to enable other users to make the URL requests: they can easily use it to upload malicious code, whole new pages, or huge files, like MP3s or movies, that will eat up all your bandwidth.
Do you know of a way to have it click a link on a page? I'm trying to work with another company's registration form. It's a stupid ASP page. On the first page it puts ?DoctorId=13074 at the end of the URL. On the next page, the registration form dynamically generates a random string in a hidden input box that gets posted with the form. So is there any way I can have it click a link once it loads a page?
Hi,
I'm using cURL to get details from an API. One of my API calls returns a zip file,
and I'd like to force the browser to download that zip file. How can I do this with cURL?
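One common pattern, sketched here with a hypothetical endpoint, is to fetch the zip with get_data() and then send attachment headers before echoing the body (the headers must go out before any other output):

$zip = get_data('https://api.example.com/export.zip'); // hypothetical URL
header('Content-Type: application/zip');
header('Content-Disposition: attachment; filename="export.zip"');
header('Content-Length: ' . strlen($zip));
echo $zip;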
David Walsh's code does not give anything to me.
Why?
I did include PHP tags before and after both code blocks.
@Joel: because you have to add:
echo $returned_content;
after the last line ($returned_content = get_data('http://davidwalsh.name');).
I want to trigger a click on a link after getting the contents of a webpage.
How do I do it?
I would like to remove the XML declaration from the returned content.
I am appending the gathered data to an existing PHP/XML file and do not want it.
Is there a simple solution?
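If the declaration always sits at the very start of the returned string, a small regex is probably enough; a sketch:

$data = get_data($url);
// Drop a leading <?xml ... ?> declaration, plus any trailing whitespace.
$data = preg_replace('/^<\?xml.*?\?>\s*/s', '', $data);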
Hi!
I'm trying to parse an AJAX-driven page with PHP. The problem is that when I read the URL's source, the AJAX-loaded part isn't visible, and I only grab the HTML from the rest of the site.
How do I solve this problem? Any ideas?
Is there a way to use cURL in PHP like you can on the command line? E.g.:
curl http://mydomain.com/picture.jpg -o "PATH_TO_SAVE_TO"
This would download a picture from a website and put it in a folder on my server. It works from Terminal, but I cannot find the equivalent in PHP.
If anyone knows the answer to this, I would greatly appreciate it.
@Kelly: yes,
you have a few options using cURL:
You can return the data as a string and append it to your PHP, HTML, XML, etc. Very handy, especially for Flash and others; see the worldwideweather.com forum. This trick allows Flash to read an external XML file for its language and database info: using PHP to pull the user-specific info, you can write the Flash XML on the fly. For example, if the user's language is Spanish (ES), PHP appends that to the XML call and writes it into the script itself, which is very fast.
Or you can return the data as an external XML file generated on the fly from a user-specific database call: XML, PHP, HTML, whatever.
HOPE THIS HELPS
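For Kelly's question above, a rough PHP equivalent of curl -o, assuming the destination directory is writable, is to hand cURL an open file handle via CURLOPT_FILE:

$source = 'http://mydomain.com/picture.jpg';
$fp = fopen('/path/to/save/picture.jpg', 'wb'); // example destination path
$ch = curl_init($source);
curl_setopt($ch, CURLOPT_FILE, $fp);         // write the body straight to disk
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects, like curl -L
curl_exec($ch);
curl_close($ch);
fclose($fp);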
Can we use this function to parse all the content at a URL?
@Indonesia: Except there are a lot more options. I believe it's possible to get the whole HTTP response using cURL, and I believe that is not true with file_get_contents().
Great,
It is useful for getting XML or images from another site if the server is not able to fetch content via fopen.
Thanks
Nice,
PHP provides two other methods to fetch a URL: cURL and fsockopen.
To see how to use them, check this example: http://www.bin-co.com/php/scripts/load/
How do I write the returned content into a new file?
Hi,
Can you put the cURL call in a loop? I have a list of about 1000 URLs that I want to hit so the caches build up. Can I just chuck the above code into a loop, or will that be too resource-heavy?
Thanks
Pete
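A plain loop over get_data() will work, but it fetches one URL at a time. For a list that size it may be worth letting cURL run requests in parallel with curl_multi; a rough sketch, assuming $urls holds the list (in practice, batch it into smaller groups so you don't open a thousand connections at once):

$mh = curl_multi_init();
$handles = array();
foreach ($urls as $url) {
	$ch = curl_init($url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
	curl_multi_add_handle($mh, $ch);
	$handles[] = $ch;
}
// Run all handles until every transfer has finished.
$running = null;
do {
	curl_multi_exec($mh, $running);
	curl_multi_select($mh); // wait for activity instead of busy-looping
} while ($running > 0);
foreach ($handles as $ch) {
	curl_multi_remove_handle($mh, $ch);
	curl_close($ch);
}
curl_multi_close($mh);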
Thanks for the code. Great!
It is very possible to turn this into an automatic crawler for user-submitted sites, or even build a fully automatic crawl out of it. The code is short, but it works for only one page at a time; to make it look at multiple pages you have to do some minor PHP coding, but nothing major.
I am working on a script right now that works using the code above and just keeps crawling based on the links found on the initial web page crawled. A non-stop spider script! They are already out there, but I like to say I can make one too.
The script will also take the meta tags (description and keywords) and place them into a database, thus giving me a search engine and not a user-submitted directory.
If you would like to join the team, simply e-mail me at justin2009@gmail.com
I want to extract the images and the first paragraph from the content at a URL. How can I do that?
Hi there, I can't post code here, but I can provide my class, which extracts a particular tag from the returned page; it could be any HTML tag.
Regards,
M. Singh
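Not M. Singh's actual class (which was never posted), but a rough sketch of the same tag-extraction idea using DOMDocument and XPath on whatever get_data() returns:

$html = get_data($url);
$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate messy real-world markup
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Every image source on the page.
foreach ($xpath->query('//img') as $img) {
	echo $img->getAttribute('src') . "\n";
}

// The first paragraph, if the page has one.
$first = $xpath->query('//p')->item(0);
if ($first !== null) {
	echo trim($first->textContent);
}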
A simple question: how do I accelerate the download process using cURL? It is damn slow, sometimes taking 45 seconds to download a 4 KB page.
It depends upon the configuration of the host server.
How do I work with HTTPS? I have a site that does not load; when I try to open it over HTTPS it simply returns a 405 error. For any help, please mail me at msingh@ekomkaar.com
http://www.zerospeedsensor.com/
i.e., it doesn't return any output.
Very strange. This function returns only relatively small pages.
It works if the source code has under 200 lines.
If the web page is bigger, it won't return anything. Not even errors.
The same thing happens with file_get_contents.
PHP Version 5.2.11
memory_limit: 128M
max_execution_time: 30
error_reporting(E_ALL)
Any idea?
Hey David,
I searched the net to find rough code with which I can check the status of "reciprocal" backlinks.
This finally helped me. :)
I modified it as per my need.
To check backlinks:

<?php
function get_data($remote_url, $mmb_url) {
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $remote_url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	$data = curl_exec($ch);
	curl_close($ch);
	if (strpos($data, $mmb_url) > 0)
	{
		echo 'found';
	}else{
		echo 'Not found';
	}
}
$remote_url = 'http://www.listsdir.com/';
$mmb_url = 'http://mymoviesbuzz.com/titles/';
$returned_content = get_data($remote_url, $mmb_url);
?>

Thanks.
Thanks, this script helped me move my WordPress content to a new host.
I think it's good practice to use CURLOPT_USERAGENT in cURL scripts…
In order to read sites encrypted by SSL, like Google Calendar feeds, you must set these cURL options:
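The options themselves weren't preserved with the comment. The commonly posted workaround looks like the sketch below; note that disabling verification is insecure, and pointing cURL at a CA bundle via CURLOPT_CAINFO is the safer fix:

// Quick and dirty: skip certificate checks (testing only; insecure).
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
// Safer alternative: verify against a CA bundle instead.
// curl_setopt($ch, CURLOPT_CAINFO, '/path/to/cacert.pem');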
Hello David,
How can I download a file from a remote URL? I've tried using your method, but no luck :(
How can I log in using cURL?
And how do I repeat the process?
How can I download the contents of a website that requires a login?
You need to set up a session for that and pass the session cookies with the headers so they can be used as in a normal login process. For further details you can contact me at msingh@ekomkaar.com
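A rough outline of that flow, with hypothetical URLs and field names, is to POST the credentials once while storing the cookies in a jar file, then reuse the jar for the protected pages:

$cookie_jar = '/tmp/cookies.txt'; // example path; must be writable

// Step 1: POST the login form (URL and field names are hypothetical).
$ch = curl_init('https://example.com/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
	'username' => 'me',
	'password' => 'secret',
)));
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar);  // save the session cookie
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar); // send it back afterwards
curl_exec($ch);

// Step 2: fetch a members-only page with the same handle and cookies.
curl_setopt($ch, CURLOPT_URL, 'https://example.com/members');
curl_setopt($ch, CURLOPT_HTTPGET, 1); // switch back to GET
$page = curl_exec($ch);
curl_close($ch);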
Hi,
I'm trying to download the contents of a website that requires a login, but my script is not working.
Could you help?
Thanks.
I'm running a web hosting website, and my domain provider gave me some HTTP APIs. I tried to implement them, but I'm getting an empty response from cURL. It's an HTTPS URL, and I used
params in my cURL call, but I'm still getting an empty response. Can anyone help me with this? I'm new to cURL :'(
Is it possible to retrieve the code inserted into an HTML tag (i.e., Flash)?
More precisely, on the LinkedIn page for a skill:
http://www.linkedin.com/skills/skill/Java?trk=skills-pg-search,
there is a graphic produced by an embedded tag, which returns an image.
When I take the source code of the page, or when I use the file_get_contents() PHP function, I can obtain only the returned tag.
I can see all this information in Firefox's analysis of the page, but I want an automatic script.
Any solution?
Thank you, the code is working fine for me. You are saving me ;)
Thank you! I looked for this code for quite some time.
Thank you!
I wasn't fully getting it, and just wanted a script I could copy and paste, make sure things were working, then modify from there. This was the only one I could find that actually returned the content. Thank you!
I’ve been here on a few occasions and appreciate every aspect of the user friendly design! And every article is quality. Thanks!
(Wow, being positive in a somewhat general way like that kind of resembles the ever infamous spam comments. Sorry :-/ )
Thanks a lot for the script!
I am trying to run cURL on localhost; I have changed php.ini. There are no errors, just a blank page. Are there any other settings in php.ini or Apache that I need to change?
When I use the PHP cURL function, it always echoes the contents of the URL when I only want it assigned to a variable. Is there a way to stop this from happening? I used your code exactly and simply called it from the main program. The behavior is the same whether I call the PHP program from the command line or via a browser.
Please ignore my previous post. For some unknown reason, I was overlooking a simple echo statement in the midst of my sloppy code. Duh…
It works just fine!
Thanks, it works. I have one question: is it possible to filter the result? I mean, I want to publish some of the content and not the rest.
Thank you
The only things that may be missing are potential redirects, potential sessions, and maybe a few other things (the browser/user agent, as mentioned). E.g., if you download a file, you will often be redirected or you will need to use sessions. The solution will be something like this:
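The snippet that followed wasn't preserved; presumably it set options along these lines on the handle inside get_data():

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow HTTP redirects
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);       // but give up eventually
curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/cookies.txt');  // keep session cookies
curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/cookies.txt'); // and send them back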
Thanks, it works.
I have a question. Sorry, but the code above doesn't work for me because I'm not familiar with PHP cURL.
I have a form with an image within it; basically it's a certificate. The user has two choices: either print or download. How can I download it with an image/jpeg content type?
Nice. This is the preferred way to get HTML.
file_get_contents for URLs is getting close to being a train wreck. It can be turned off or on unpredictably by hosts, and it seems incompatible with many modern Linux distributions out of the box. file_get_contents seems to use its own rules for name resolution and often times out or is extremely slow. There seems to be no consistent fix for this.
Don’t use file_get_contents. Use cURL. Combined with the Simple DOM Parser, it is powerful stuff.
Thanks for the code. It works well when I try to store the contents of a page from the intranet or local server, but it is not working when I try to load a page from the internet, say http://www.google.com or any other site. If there is any solution to this problem, please mail me.
This cURL code extracts the page as a whole. Am I able to extract some part from inside the page? E.g., I want to extract a portion between two tags. What other code would I use to extract it?
I implemented it, but it doesn't work. What's wrong with my code?
@Ano: there is a problem with your domain: http://data.alexa.com/data?cli=10&dat=snbamz&url=pasarkode.com
If you go and look at the source code, you can see the function is working fine, but you are actually getting XML back, so you must parse the response as XML instead of HTML.
Thanks a lot, @Vinay Pandya!
I was trying to figure out why I could not download files from HTTPS URLs.
I was going nuts because curl_setopt($ch, CURLOPT_SSLVERSION, 3); didn't work, but your code is good.
Thank you!
I am trying to add a piece of code which gets a URL and displays the content of that page in an article, using this block of code. I am getting nothing; it will not do anything. I have activated the PHP plugin. I am on Joomla version 3.4.6. The page I am trying to show the code on is here: http://www.chefjamie.com/2015/index.php/features-2/layouts
Hi all,
I run the code and get a blank page. When I add an
echo $returned_content
I don't get the source code but the rendered page itself. If I execute
curl -s 'http://download.finance.yahoo.com'
on the command line, I get the source code. Can anyone help me?
Thanks
Alex
Can you please help me find a solution for my problem with cURL?
I wrote a script that allows me to use cURL to get information on streaming links. I managed to write the script for the streaming links that are hosted on the streaming website, and I was able to get the information from the servers. I also use the WGET command when I want to download a link.
My problem is: how do I use cURL or WGET to get a response confirming that a link exists (the link works with VLC or in Kodi) and is valid on the server, like the link below? (I got the links from Kodi.)
I mean that I want to use cURL or WGET with Kodi links to get information from the server.
The purpose of the request is to prove that the link exists. With the curl command, I get a 403 Forbidden response, while the link is functional via Kodi. Here is an example link:
URL: http://dittotv.live-s.cdn.bitgravity.com/cdn-live/_definst_/dittotv/secure/zee_cinema_hd_Web.smil/playlist.m3u8
I also tried wget -a --spider myurl, and I receive a return code of 8.
Thank you for your time, sir.
The script that I use:
Is it possible to download a large file, for example 500 MB or 1 GB, to the server through this process?
Is there any way to do it with jQuery and Ajax to make it more user-friendly?
Thanks for your answer in advance.