Download a URL’s Content Using PHP cURL

Downloading the content at a specific URL is a common task, especially given the growing use of web services and APIs offered by Amazon, Alexa, Digg, and others. PHP's cURL extension, which is enabled in many default shared hosting configurations, lets web developers do this in a few lines.

The Code

/* gets the data from a URL */
function get_data($url) {
	$ch = curl_init();
	$timeout = 5;
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

The Usage

$returned_content = get_data('https://davidwalsh.name');
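
The function above returns false on any failure without saying why. Here is a minimal sketch of a variant that reports the reason, assuming you would rather get an exception than a silent false. The name get_data_checked and the 10-second overall timeout are illustrative choices, not part of the original function.

```php
<?php
/**
 * Hypothetical variant of get_data() that throws instead of returning false.
 */
function get_data_checked(string $url, int $timeout = 5): string {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // limit on connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);              // assumed limit on the whole transfer

    $data   = curl_exec($ch);
    $error  = curl_error($ch);                          // empty string when nothing went wrong
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);    // 0 for non-HTTP transfers
    curl_close($ch);

    if ($data === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    if ($status >= 400) {
        throw new RuntimeException("Server responded with HTTP status $status");
    }
    return $data;
}
```

Wrapping the call in try/catch and echoing the exception message is usually enough to tell a missing cURL extension, a timeout, and an HTTP error apart.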

Alternatively, you can call the file_get_contents function with a remote URL, but this depends on the allow_url_fopen setting, which many hosts disable.
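
For comparison, a rough sketch of the file_get_contents route. It assumes you want null back when the host has allow_url_fopen turned off or the request fails; the function name get_data_fopen is made up for this example.

```php
<?php
/**
 * Hypothetical helper: fetches a URL with file_get_contents(),
 * or returns null when that is not possible.
 */
function get_data_fopen(string $url, int $timeout = 5): ?string {
    // Remote URLs only work through the stream wrappers when allow_url_fopen is on.
    $isRemote = (bool) preg_match('#^(https?|ftp)://#i', $url);
    $allowed  = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($isRemote && !$allowed) {
        return null;
    }

    // The "http" context options also apply to https:// URLs.
    $context = stream_context_create(['http' => ['timeout' => $timeout]]);
    $data = @file_get_contents($url, false, $context);

    return $data === false ? null : $data;
}
```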

Discussion

  1. Alternatively you can use the PHP DOM:

    $keywords = array();
    $domain = array('http://davidwalsh.name');
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = FALSE;
    foreach ($domain as $key => $value) {
        @$doc->loadHTMLFile($value);
        $anchor_tags = $doc->getElementsByTagName('a');
        foreach ($anchor_tags as $tag) {
            $keywords[] = strtolower($tag->nodeValue);
        }
    }
    

    Keep in mind this is not a tested piece of code, I took parts from a working script I have created and cut out several of the checks I’ve put in to remove whitespace, duplicates, and more.

    • karthik

      hi

      I am using external url to valid or invalid page. how to get success or failure without using curl

  2. For your script we can also add a User Agent:

    $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    
    Some other options I use:
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
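
Put together, the options from this comment might look like the sketch below. The function name, the user agent string, and the redirect limit are placeholders chosen for the example, so swap in your own.

```php
<?php
/**
 * Hypothetical variant of get_data() with the extra options applied.
 * Returns the body as a string, or false on failure.
 */
function get_data_with_options(string $url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => 'ExampleFetcher/1.0', // placeholder user agent
        CURLOPT_FAILONERROR    => true,  // treat HTTP status >= 400 as a failure
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_MAXREDIRS      => 5,     // assumed cap on redirects
        CURLOPT_AUTOREFERER    => true,  // set Referer when following redirects
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
```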
    
    • Ghulam Rasool

      Hi Shawn,
      using user agent as option helped me to sort out my problem.

      Thanks

  3. Excellent additions Shawn — thank you for posting them!

  4. And with this great power, comes great responsibility =)

  5. Very true Chris. It’s up to the developer to use it for good or evil. I suppose I’ve used it for both in the past.

    For downloading remote XML or text files, this script has been golden.

  6. KP

    Great script! Does anyone know how to use that script to save the content it gathered and save it to a file locally on the server?

  7. @KP: Check out my other article, Basic PHP File Handling — Create, Open, Read, Write, Append, Close, and Delete, here:

    http://davidwalsh.name/basic-php-file-handling-create-open-read-write-append-close-delete
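
For the simple case of writing the fetched string to disk, a sketch along these lines should do. The name save_content is invented for the example, and the target path is up to you.

```php
<?php
/**
 * Hypothetical helper: writes $content to $path and returns the number of bytes written.
 */
function save_content(string $content, string $path): int {
    // LOCK_EX prevents two requests from writing the file at the same time.
    $bytes = file_put_contents($path, $content, LOCK_EX);
    if ($bytes === false) {
        throw new RuntimeException("Could not write to $path");
    }
    return $bytes;
}

// Possible usage with the article's function:
// save_content(get_data('https://davidwalsh.name'), __DIR__ . '/page.html');
```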

  8. Usman

    I am trying to use this function “get_data($url)”, but it gives blank page when I echoed it. Anybody can please help me?

  9. @Usman: There are a few reasons why you may get a blank page. You may not have cURL installed on the server. The other possibility is that you need to “echo” the content before you close the connection; someone brought this issue to me the other day.

  10. Usman

    Hello David,
    I am still unable to get result of it, I have checked(using phpinfo()) that CURL is installed. But its giving blank page. When I tried it from php command line its working.

    • Hi, this script works for me but unfortunately fails on urls from same domain as calling script. i cant see any error in error.log

  11. Dru

    Works like a charm!

  12. Works just like…. file_get_contents! Thanks.

  13. Ajay

    The code is very effective. but the problem is it returns all the html tags like and others. so is there anyway to get rid of it?

    • phpBeginner

      Use strip_tags($textRetrieved); This will return the string with no tags. I hope this helps.
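
A slightly fuller sketch of that idea: strip_tags() keeps the text inside script and style elements, so this version removes those first and tidies the whitespace. The helper name html_to_text is made up for the example.

```php
<?php
/**
 * Hypothetical helper: reduces an HTML string to plain text.
 */
function html_to_text(string $html): string {
    // Drop script and style blocks first; strip_tags() would keep their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html) ?? $html;
    $text = strip_tags($html);
    // Turn entities such as &amp; back into characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}
```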

  14. bit

    this code is way too short, even php.net probably has a longer version! beware if you use this to enable other users to make the URL requests, they can easily use it to upload malicious code/whole new pages/huge files, like mp3s or movies, that will eat up all your bandwidth.

  15. Do you know of a way to have it click a link on a page. I’m trying to work with another companies registration form. Its a stupid asp page. On the first page it puts ?DoctorId=13074 at the end of the url. On the next page with the registration form it dynamically makes a random string in a hidden input box that gets posted with the form. So is there any way I can have it click and link once it loads a page?

  16. Thomas Alexander

    hi,
    I’m using curl to get details from an api call, in one of my api call it returns a zip file,
    i like to force download that zip file , how can i do this with curl

  17. Joel Kiskola

    David Walsh code does not give anything to me.
    Why?

    I did include php tags before and after both codes.

  18. @Joel – cause you have to add : echo $returned_content after the last line ($returned_content = get_data('http://davidwalsh.name');)

  19. i want to onclick a link after getting contents of webpage
    how to do it?

  20. George

    I would like to remove the xml declaration from the returned url.

    I am appending the gathered data to an existing php/xml file and do not want it.

    is there a simple solution??
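
One way to do this is a single regular expression applied to the returned string. This is a sketch that assumes the declaration, if present, is the first thing in the data; strip_xml_declaration is a made-up name.

```php
<?php
/**
 * Hypothetical helper: removes a leading XML declaration, if there is one.
 */
function strip_xml_declaration(string $xml): string {
    // Matches optional leading whitespace, the declaration itself,
    // and any whitespace that follows it. Only the first match is removed.
    return preg_replace('/^\s*<\?xml\b[^?]*\?>\s*/i', '', $xml, 1) ?? $xml;
}
```

Data without a declaration comes back unchanged, so it is safe to run on every response before appending.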

  21. FMdB

    hi!

    im trying to parse AJAX with PHP, problem is:

    when i read the URL SOURCE, the AJAX part isn’t visible, and i only grab HTML from the rest of site.

    how to solve this problem? any ideas?

  22. Kelly

    Is there a way to use curl in php like you can in the command line. aka

    curl http://mydomain.com/picture.jpg -o “PATH_TO_SAVE_TO”

    This would download a picture from a website and put it in a folder on my server. It works from Terminal but i cannot find the equivalent in PHP.

    If anyone knows the answer to this I would greatly appreciate it.

  25. george

    @Kelly: yes,

    place something like this in your php: 3 options,

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,
            "http://www.whateveryouwant.com.php.html.xml");
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $xml_language = curl_exec($ch);
    curl_close($ch);
    echo $xml_language;
    

    you have options using curl:
    return the data in with database driven string. returns the data and appends it to your php, html,xml etc. VERY HANDY – esp. for flash and others, see: worldwideweather.com – forum,

    this trick allows flash to read an external xml file for its language and database info. using php to call the user-specific info you can write the flash xml on the fly – this script returns the users language interface for flash, php calls the xml – the user is spanish (language) ES and appending to the php xml call, the file is read and writes this into the php script itself with , very very fast

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,
            "http://www.verdegia.com/Files/System/TEST/Language/M_TEXT_" . $line["Language"] . ".xml");
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $xml_language = curl_exec($ch);
    curl_close($ch);
    echo "$xml_language";
    

    return the data in external xml file from php user specific database call ” string – gets data specific for user and generates file on the fly , xml, php, html whatever..:

    	
    $sql="SELECT * FROM $tbl_name WHERE username='$myusername'";
    				$results = mysql_query($sql);
    		while($line=mysql_fetch_assoc($results)) {
    				$file = "http://www.worldweatheronline.com/feed/weather.ashx?q=" . urlencode($line["Postcode"]) . "&format=xml&num_of_days=5&key=6c7e92e827155910100801";
    				}
    				
    $ch = curl_init($file);
    $fp = @fopen("../Files/System/TEST/temp.xml", "w");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    $file = "../Files/System/TEST/temp.xml";
    $fp = fopen($file, "r");
    

    HOPE THIS HELPS
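
As a rough equivalent of the command-line curl URL -o FILE asked about earlier, CURLOPT_FILE makes cURL write straight into an open file handle instead of building a string in memory. The sketch below assumes a failed download should leave no partial file behind; download_to_file is a made-up name.

```php
<?php
/**
 * Hypothetical helper: streams $url into $destination.
 * Returns true on success, false on failure.
 */
function download_to_file(string $url, string $destination): bool {
    $fp = fopen($destination, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body to the file handle
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // do not save HTTP error pages
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($destination); // remove the empty or partial file
        return false;
    }
    return true;
}
```

Because the body never has to fit in a PHP string, this approach is also the one to reach for with large files.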

  26. Can we use this function to parse all content in a url?

  27. @Indonesia: Except there are a lot more options. I believe it’s possible to get the whole HTTP response using cURL, and I believe that is not true with file_get_contents().

  28. $sql =  "UPDATE staff SET 
    		staffNo = $staff_no,
    		f_name=$fname, 
    		l_name=$lname,
    		sex=$sex,
    		DOB=$dob,
    		position=$position, 
    		salary=$salary,
    		hiredate=$hiredate,
    		contact_id=$contact_id,
    		branchNo=$branch_no
    		WHERE staffNo=$staff_no";
    $query = mysql_query($sql) or die("Cannot query the database." . mysql_error());
    echo "Database Updated.";
  29. Great,

    It is useful for getting XML or images from another site if the server is not able to get content with fopen.

    Thanks

  31. Nice,
    PHP provides two other methods to fetch a URL: cURL and fsockopen.
    To use them, you can check this example: http://www.bin-co.com/php/scripts/load/

  32. how to write return into new file?

  33. Hi,

    Can you put the curl call in a loop, i have a list of about 1000 urls that i want to ‘hit’ so the caches build up, can i just chuck the above code into a loop or will that be too resource heavy?

    Thanks

    Pete
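
A plain loop is a reasonable starting point. The sketch below reuses one cURL handle for every URL, which lets cURL keep connections open between requests to the same host. The function name warm_urls is invented, and the timeouts are assumptions; with around 1000 URLs fetched one after another, the script may also need a higher max_execution_time when run through a web server.

```php
<?php
/**
 * Hypothetical helper: requests each URL once and reports which ones succeeded.
 *
 * @param string[] $urls
 * @return array<string, bool> map of URL to success flag
 */
function warm_urls(array $urls, int $timeout = 10): array {
    $results = [];
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body out of the output
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);         // only the URL changes per request
        $results[$url] = curl_exec($ch) !== false;
    }

    curl_close($ch);
    return $results;
}
```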

  34. Thanks for the code..Great!

  35. Myister

    It is very possible to put this into a automatic crawler for user inputted sites or even make a automatic crawl out of this… The code is short but it works for only one page at a time.. To make it look at multiple pages you have to do some minor PHP coding but nothing major…

    I am working on a script right now that works using the code above and just keeps crawling based on the links that are on the initial web page crawled. A non-stop spider script! They are already out there but I like to say I can make one too…

    The script will also take the Meta tags ( Description and Keywords and place them into a database too. Thus giving me a search engine and not a user submitted directory…

    if you would like to join the team simply e-mail me at justin2009@gmail.com

  36. Learnphp123

    I want to extract the images present in the URL and first paragraph from the url. How can I do that?

    • Hi there, I can’t post code here, so I can provide you my class, which extracts a particular tag from the returned page; it could be any HTML tag.

      Regards,
      M. Singh
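
For readers who want a starting point, here is a small sketch using PHP's DOM extension on the HTML string returned by the download. It is not the class mentioned above; the function name and the shape of the returned array are assumptions made for the example.

```php
<?php
/**
 * Hypothetical helper: returns the image URLs and the text of the first paragraph.
 *
 * @return array{images: string[], first_paragraph: ?string}
 */
function extract_images_and_first_paragraph(string $html): array {
    if (trim($html) === '') {
        return ['images' => [], 'first_paragraph' => null];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $images[] = $src; // may be relative to the page URL
        }
    }

    $first = $doc->getElementsByTagName('p')->item(0);

    return [
        'images'          => $images,
        'first_paragraph' => $first ? trim($first->textContent) : null,
    ];
}
```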

  37. Comrade

    A simple question..how to accelerate the downloading process using cURL it is damn slow…takes sometimes 45sec to download 4kb page

    • Ivan Ivković

      It depends upon the configuration of the host server.

  39. How to work with https. I have site which not loading when I try to open https it simple return 405 error. any help please mail me at msingh@ekomkaar.com

  40. http//www.zerospeedsensor.com/

  41. ie it doesnt return any output.

  42. Octav

    Very strange. This function returns only relative small pages.
    It works if the source code has under 200 lines.
    If the web page is bigger won’t return anything. Not even errors.
    Same thing happens with file_get_contents.

    PHP Version 5.2.11
    memory_limit – 128M
    max_execution_time – 30
    error_reporting(E_ALL)

    Any idea?

  43. Hey David,

    I did searched on net to find rough code by which i can get “Reciprocal” back links status.

    This helps me finally. :)

    I do modify it as per my need.

    To check backlinks
    0)
    {
    echo ‘found’;
    }else{
    echo ‘Not found’;
    }
    }

    $remote_url = ‘http://www.listsdir.com/’;
    $mmb_url = ‘http://mymoviesbuzz.com/titles/’;

    $returned_content = get_data($remote_url,$mmb_url);
    ?>

    Thanks.

  44. Thanks, this script help me to move my wordpress content to new host.

  45. harish

    I think good practice to use CURLOPT_USERAGENT in cURL scripts…

  46. In order to read sites encrypted by SSL, like Google Calendar feeds, you must set these CURL options:

    curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
    curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
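
Turning both checks off does make the request go through, but it also removes the protection HTTPS is supposed to give against someone intercepting the connection. A sketch of the alternative, which keeps verification on and optionally points cURL at a CA bundle, is below. The function name is made up, and the bundle path in the usage line is a placeholder for wherever your certificates live.

```php
<?php
/**
 * Hypothetical helper: cURL options that keep certificate checks enabled.
 */
function secure_ssl_options(?string $caBundlePath = null): array {
    $options = [
        CURLOPT_SSL_VERIFYPEER => true, // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // check the certificate matches the host name
    ];
    if ($caBundlePath !== null) {
        // Use this bundle instead of the system default.
        $options[CURLOPT_CAINFO] = $caBundlePath;
    }
    return $options;
}

// Possible usage on an existing handle:
// curl_setopt_array($ch, secure_ssl_options('/path/to/cacert.pem'));
```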
    
  47. Hello David,
    How can I download a file from remote url? I’ve try using your method but no luck :(

  48. how can login by curl

  49. How repeat the process

  50. Thunderbird

    How can I download the contents of a website that requires login??

    • You need to set session for that and pass them with header so they can use as normal login process. For further details you can contact me at msingh@ekomkaar.com

    • Peter

      Hi,
      I’m trying to download contents of a website that requires login, but my script is not working.
      Could you help ?
      Thanks.

  51. Vinoth Kumar

    I’m running Web hosting Website. There My Domain Provider gave me some HTTP API’s. I tried to implement them but i’m getting empty response from curl. Its a HTTPS url and i used

    curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
    curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
    

    params in my curl. But still getting empty response. Can anyone help me in this! I’m new to cURL :'(

  52. Giu87

    It is possible to retrieve the code inserted into html tag (i.e. flash)?

    More precisely, @ the linkedin page of a skill:

    http://www.linkedin.com/skills/skill/Java?trk=skills-pg-search,

    there is a graphic obtained by an tag, which returns an image.

    When I take source code of the page or when I use file_get_contents() php function, I can obtain only the returned tag.

    I can see on the Firefox analysis of the page all these information, but I want an automatic script.

    Any solution?

  53. Mika Andrianarijaona

    Thank you, the code is working fine for me. You are saving me ;)

  54. andrei

    thank you! i looked for this code quite some time.

  55. David

    Thank you!

    I wasn’t fully getting it, and just wanted a script I could copy and paste, make sure things were working, then modify from there. This was the only one I could find that actually returned the content. Thank you!

    I’ve been here on a few occasions and appreciate every aspect of the user friendly design! And every article is quality. Thanks!
    (Wow, being positive in a somewhat general way like that kind of resembles the ever infamous spam comments. Sorry :-/ )

  56. Thanks a lot for the script!

  57. I am trying to run curl on localhost, I have changed php.ini. No errors a blank page only coming..is there any other settings in php.ini or apache settings?

  58. Tom

    When I use the PHP curl function, it always wants to first return (as in echo) the contents of the URL when I only want it assigned to a variable. Is there a way to stop this from happening? I used your code exactly and simply called it from the main program. The behavior is the same if I call the php program from the command line or from via a browser.

  59. Tom

    Please ignore my previous post. For some unknown reason, I was overlooking a simple echo statement in the midst of my sloppy code. Duh…

    It works just fine!

  60. Thanks . It works. i have one question. is it possible to filter the result. i mean, i want to publish some contents and some not.
    Thank You

  61. Only thing that may be missing is potential redirects, potential sessions, and maybe a few other thing (browser as mentioned),. E.g. if you will download a file you will often be redirected or you will need to use sessions. The solution to this will something like this:

    curl_setopt($s,CURLOPT_FOLLOWLOCATION,1);
    curl_setopt($s,CURLOPT_COOKIEJAR, '/tmp/cookie.txt');
    curl_setopt($s,CURLOPT_COOKIEFILE,'/tmp/cookie.txt');
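
Those three options can be wrapped in one function so that a login request and the requests after it share the same cookie file. This is only a sketch: the function name is invented, and the login URL and form field names in the usage comment are hypothetical, since every site names them differently.

```php
<?php
/**
 * Hypothetical helper: fetches $url while keeping cookies in $cookieFile.
 * Pass $postFields to send a POST (for example a login form).
 * Returns the body as a string, or false on failure.
 */
function fetch_with_session(string $url, string $cookieFile, ?array $postFields = null) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,        // logins and downloads often redirect
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies are saved afterwards
        CURLOPT_COOKIEFILE     => $cookieFile, // where cookies are read from
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $data = curl_exec($ch);
    curl_close($ch);
    // The cookie jar is written once the handle is released,
    // which has happened by the time this function returns.
    return $data;
}

// Possible usage (URLs and field names are placeholders):
// fetch_with_session('https://example.com/login', '/tmp/cookie.txt',
//     ['username' => 'me', 'password' => 'secret']);
// $page = fetch_with_session('https://example.com/members', '/tmp/cookie.txt');
```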
    
  62. Thanks . It works.

  63. $url = "https://api.dailymotion.com/video/xz9frh?fields=price_details";
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
    $data = curl_exec($ch);
    curl_close($ch);
    // No "return" here: outside a function it would end the script before the echo.
    $returned_content = $data;
    echo $returned_content;
    $error = substr($returned_content, 2, 5);
    echo $error;
  64. I have a question, sorry but the code above don’t work for me because i’m not familiar with PHP CURL.
    I have a form and an image within the form, but basically its a certificate. The user has two choices, either print or download. how can i download into am image/jpeg content-type..

  65. Breen

    Nice. This is the preferred way to get HTML.

    file_get_contents for URLs is getting close to being a train wreck. It can be turned off or on unpredictably by hosts, and it seems incompatible with many modern Linux distributions out of the box. file_get_contents seems to use its own rules for name resolution and often times out or is extremely slow. There seems to be no consistent fix for this.

    Don’t use file_get_contents. Use cURL. Combined with the Simple DOM Parser, it is powerful stuff.

  66. shafiul

    thanks for the code,it works well when i try to store the contents of a page from the intranet or local server but it is not working when i m trying to load a page from the internet say http://www.google.com or any other sites. So, if there is any solution to this problem please mail me.

  67. This curl code is extracting page as whole. Am i able to extract some part from inside page .. Ex: i want to extract a portion in between ? what else code i will use to extract?

  68. I implement

    $datanya = get_data('http://data.alexa.com/data?cli=10&dat=snbamz&url=pasarkode.com');
    
    print_r($datanya);
    
    function get_data($url) {
    	$ch = curl_init();
    	$timeout = 5;
    	curl_setopt($ch, CURLOPT_URL, $url);
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    	$data = curl_exec($ch);
    	curl_close($ch);
    	return $data;
    }
    

    Doesn’t work, what wrong with my code?

  69. Laurent

    Thanks a lot @Vinay Pandya !
    I was trying to figure out why I could not download files from HTTPs urls.
    I was like crazy as nuts because “curl_setopt($ch, CURLOPT_SSLVERSION,3);” didn’t work but your code is good.

  70. Kemal

    Thank you!

  71. I am trying to add a piece of code which gets a url and displays content on that page in an article form the web using this block of code. I am getting nothing, it will not do anything. I have activated the php plugin. I am on a Joomla 3.4.6 version. The page I am trying to show the code on is here http://www.chefjamie.com/2015/index.php/features-2/layouts

    /* gets the data from a URL */
    function get_data($url) {
    	$ch = curl_init();
    	$timeout = 5;
    	curl_setopt($ch, CURLOPT_URL, $url);
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    	$data = curl_exec($ch);
    	curl_close($ch);
    	return $data;
    }
    
    $returned_content = get_data('http://melissas.com');
    
  72. Alex

    Hi all,

    I run the code and get a blank page. When I add an echo $returned_content, I don’t get the source code but the page itself.
    If I execute curl -s 'http://download.finance.yahoo.com' on command line I get the source code.

    Can anyone help me?
    Thanks
    Alex

  73. achi

    Can you please help to find a solution for my problem with Curl .
    I wrote a script that allows me to use CURL to have information on streaming links. I managed to write the script for the streaming links that are hosted in the streaming website, and I was able to get the information from the servers, Also I use the command WGET when I want to download the link.
    My problem is: how to use CURL or WGET to get a response that the link exists ( the link work with VLC or in KODI ) and it is valid in the server like this link: ( i got the links from KODI )
    I mean that i want to use CURL or WGET with kodi links to get information from the server
    The purpose of the request is how to prove that the link exists. With the curl command, I have a forbidden return 403 while the link is functional via kodi. Here is my script and example of a link for example :
    URL –>http://dittotv.live-s.cdn.bitgravity.com/cdn-live/_definst_/dittotv/secure/zee_cinema_hd_Web.smil/playlist.m3u8
    Also i tried : wget -a –spider myurl –> i receive a 8 code returned.

    Thank you for your time, Sir

    The Script that i use :

    #!/bin/bash
    ans2=Y
    while [ "$ans2" = "Y" ]; do
      read -r -p "URL to check: " url
      # --fail makes curl exit non-zero on HTTP errors such as 403 or 404.
      if curl -v -i --output /dev/null --silent --fail "$url"; then
        printf '%s --> The link exists!\n' "$url"
      else
        printf '%s --> The link does not exist!\n' "$url"
      fi
      read -r -p "Show the cURL information for the streaming link? (Y/N/Q): " ans
      if [ "$ans" = "Q" ]; then
        exit
      fi
      if [ "$ans" = "Y" ]; then
        curl -v -i "$url"
      else
        printf 'OK, no problem. Next question:\n'
      fi
      read -r -p "Download the streaming video from the streaming server? (Y/N/Q): " ans3
      if [ "$ans3" = "Q" ]; then
        exit
      fi
      if [ "$ans3" = "Y" ]; then
        # Only download when a HEAD request succeeds.
        if curl --output /dev/null --silent --head --fail "$url"; then
          wget "$url"
        else
          printf 'The link is down! No file to download.\n'
        fi
        exit
      fi
      if [ "$ans3" = "N" ]; then
        printf 'OK, no problem. Next question:\n'
      fi
      read -r -p "Check another URL? (Y/N): " ans2
      if [ "$ans2" = "N" ]; then
        printf 'Goodbye, thank you!\n'
      fi
    done
     
  74. Shafiq

    Is it possible to download the large file to server for example 500MB or 1 GB file through this process.

    Is there any way to do it by jquery & ajax to make it more userfriendly.

    Thanks for your answer in advance.
