Extract Content using PHP and Preview like Facebook

Last modified on December 15th, 2017 by Vincy.

When a user shares a URL, it is convenient if the client automatically extracts the page title and an image and shows them as a preview. Social media sites such as Facebook and LinkedIn do this: they extract the title and meta information of a shared link. In this tutorial, we are going to extract the page title, meta description and images from the URL shared by the user.


In this example, I use PHP and CURL to extract content and images from the given URL. When the user enters a link in the input text box, an AJAX call is sent to a PHP page, which makes a CURL request to fetch the remote data. In a previous tutorial, we saw how to extract content from a URL and parse it using Simple HTML DOM Parser. After fetching the remote data, the PHP code generates JSON with the title, description and image data and returns it as the AJAX response, which is used to show the preview in the browser.
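As a rough illustration of the last step, the JSON response can be turned into preview markup on the client. The sketch below assumes the response has the keys title, body and image_src used later in this tutorial; the helper names escapeHtml and buildPreviewHtml and the sample data are hypothetical. Escaping the remote text before inserting it is a precaution added here and is not part of the tutorial's own script.

```javascript
// Escape text before placing it inside HTML markup.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// Build the text part of the preview from the JSON response.
// `data` is assumed to look like { title, body, image_src }.
function buildPreviewHtml(data, remoteUrl) {
    return '<div class="text-data">' +
        '<a class="page-title" href="' + escapeHtml(remoteUrl) + '" target="_blank">' +
        escapeHtml(data.title) + '</a>' +
        '<div>' + escapeHtml(data.body) + '</div></div>';
}

// Hypothetical sample response, for illustration only
var sample = {
    title: "Tom & Jerry <Official>",
    body: "A sample description",
    image_src: ["https://example.com/a.png"]
};
console.log(buildPreviewHtml(sample, "https://example.com/"));
```

Because the title and description come from a remote page, treating them as plain text keeps markup in those fields from being injected into the preview.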

jQuery AJAX call to initiate CURL

The following code shows the jQuery AJAX script that requests remote data for the given input URL. When the user types or pastes a URL into the input field, the keyup handler sends an AJAX request to get-data.php. This file extracts the remote content via CURL and returns it as the AJAX response.

After the data is fetched from the URL, it is previewed in the browser. If the page contains more than one image, the user can browse them by clicking the previous and next navigation controls shown below the image preview.

<script type="text/javascript">
$(document).ready(function() {
    // State shared between the AJAX callback and the navigation handlers
    var image_src = [];
    var total_images = -1;           // index of the last image, -1 when there are none
    var current_image_position = -1;

    $('#remote-url').on("keyup", function() {
        $("#output").html("");
        $("#loader").show();

        var remote_url = $(this).val();
        var image_html = '';

        $.ajax({
            url: "get-data.php",
            type: "POST",
            data: {'url': remote_url},
            dataType: "json",
            success: function(data, status) {
                image_src = data.image_src || [];
                total_images = image_src.length - 1;
                // Start the preview on the last image
                current_image_position = total_images;

                if (total_images >= 0) {
                    image_html = '<div class="image-preview" id="image-preview"><img src="' + image_src[current_image_position] + '"></div>' +
                        '<div class="prev-next-navigation"><span class="prev-img" id="prev-img"> </span><span class="next-img" id="next-img"> </span> </div>';
                }

                var content_html = '<div class="text-data"><a class="page-title" href="' + remote_url + '" target="_blank">' + data.title + '</a><div>' + data.body + '</div></div>';
                var responseHTML = image_html + content_html;

                $("#output").html(responseHTML).show();
                $("#loader").hide();
            },
            error: function() {
                $("#loader").hide();
                alert("Problem in extracting data from the remote URL");
            }
        });
    });

    // Show the previous image, if there is one
    $("body").on("click", "#prev-img", function(e) {
        if (current_image_position > 0) {
            current_image_position--;
            $("#image-preview").html('<img src="' + image_src[current_image_position] + '">');
        }
    });

    // Show the next image, if there is one
    $("body").on("click", "#next-img", function(e) {
        if (current_image_position < total_images) {
            current_image_position++;
            $("#image-preview").html('<img src="' + image_src[current_image_position] + '">');
        }
    });
});
</script>
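The previous and next handlers above boil down to moving an index while keeping it inside the bounds of the image list. A minimal sketch of that logic as a standalone function follows; the name stepImage and the sample image list are hypothetical and not part of the tutorial's script.

```javascript
// Return the new image index after moving by `delta`
// (-1 for previous, +1 for next), staying within 0..lastIndex.
// With no images (lastIndex < 0) the position is left unchanged.
function stepImage(position, delta, lastIndex) {
    var next = position + delta;
    if (lastIndex < 0 || next < 0 || next > lastIndex) {
        return position;
    }
    return next;
}

var images = ["a.png", "b.png", "c.png"]; // hypothetical image list
var lastIndex = images.length - 1;
var position = lastIndex;                       // the preview starts on the last image
position = stepImage(position, +1, lastIndex);  // already at the end, stays at 2
position = stepImage(position, -1, lastIndex);  // moves back to 1
console.log(images[position]); // prints "b.png"
```

Keeping the bounds check in one place means both click handlers can share it, and the case of a page without images needs no special handling.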

PHP CURL Request to Extract Title and Meta From URL

The following PHP code shows how to get the page title and other meta details using CURL. In this code, I initialise a CURL handle and set the URL to be accessed on it. The CURL request returns the HTML content of the remote page.

After getting the HTML content, we parse it by looking up the title, meta and img tags to get the page title, description and image URLs, respectively. This data is encoded as a JSON object and returned as the AJAX response.

<?php
if (isset($_POST["url"]) && filter_var($_POST["url"], FILTER_VALIDATE_URL)) {

    // Extract HTML using curl
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $_POST["url"]);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    // Only fetch web pages; FILTER_VALIDATE_URL also accepts other schemes
    curl_setopt($ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTP | CURLPROTO_HTTPS);

    $data = curl_exec($ch);
    curl_close($ch);

    // Stop if the request failed or returned an empty body
    if ($data === false || $data === '') {
        http_response_code(502);
        exit;
    }

    // Load HTML to DOM Object (@ suppresses warnings about malformed markup)
    $dom = new DOMDocument();
    @$dom->loadHTML($data);

    // Parse DOM to get Title; fall back to an empty string if there is none
    $nodes = $dom->getElementsByTagName('title');
    $title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : "";

    // Parse DOM to get Meta Description
    $metas = $dom->getElementsByTagName('meta');
    $body = "";
    for ($i = 0; $i < $metas->length; $i ++) {
        $meta = $metas->item($i);
        if ($meta->getAttribute('name') == 'description') {
            $body = $meta->getAttribute('content');
        }
    }

    // Parse DOM to get Images; only absolute image URLs are kept
    $image_src = array();
    $images = $dom->getElementsByTagName('img');

    for ($i = 0; $i < $images->length; $i ++) {
        $image = $images->item($i);
        $src = $image->getAttribute('src');

        if (filter_var($src, FILTER_VALIDATE_URL)) {
            $image_src[] = $src;
        }
    }

    $output = array(
        'title' => $title,
        'image_src' => $image_src,
        'body' => $body
    );
    header('Content-Type: application/json');
    echo json_encode($output);
}
?>
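The PHP script above only processes input that passes FILTER_VALIDATE_URL, so partial text typed into the field produces no JSON and triggers the error alert. A rough client-side counterpart could skip the request for such input. The sketch below is an optional addition under stated assumptions: the name isHttpUrl is hypothetical, the restriction to http and https is a choice made here, and the check is not identical to PHP's filter.

```javascript
// Return true when `value` parses as an absolute http(s) URL.
function isHttpUrl(value) {
    try {
        var parsed = new URL(value);
        return parsed.protocol === "http:" || parsed.protocol === "https:";
    } catch (e) {
        return false; // the URL constructor throws on unparsable input
    }
}

console.log(isHttpUrl("https://example.com/page")); // true
console.log(isHttpUrl("example"));                  // false
console.log(isHttpUrl("ftp://example.com/file"));   // false
```

In the keyup handler, the AJAX call could then be wrapped in a condition such as if (isHttpUrl(remote_url)) { ... }, while the server-side validation stays in place as the authoritative check.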
