A photographer friend of mine implored me to find and download images
of picture frames from the internet. I eventually landed on a web page
that had a number of them available for free but there was a problem: a
link to download all the images together wasn’t present.
I didn’t want to go through the stress of downloading the images individually, so I wrote this PHP class to find, download and zip all images found on the website.
How the Class works
It searches a URL for images, downloads and saves the images into a folder, creates a ZIP archive of the folder and finally deletes the folder.
The class uses Symfony’s DomCrawler component to search for all image links found on the webpage and a custom zip function that creates the zip file. Credit to David Walsh for the zip function.
Coding the Class
The class consists of five private properties and eight public methods, including the __construct magic method.
Below is the list of the class properties and their roles.
1. $folder: stores the name of the folder that contains the scraped images.
2. $url: stores the webpage URL.
3. $html: stores the HTML document code of the webpage to be scraped.
4. $fileName: stores the name of the ZIP file.
5. $status: saves the status of the operation, i.e. whether it was a success or a failure.
Let’s get started building the class.
Create the class ZipImages containing the above five properties.
<?php
class ZipImages {
    private $folder;
    private $url;
    private $html;
    private $fileName;
    private $status;
Create a __construct magic method that accepts a URL as an argument. It stores the URL, fetches the HTML of the page and sets up the default image folder.
    public function __construct($url) {
        $this->url = $url;
        $this->html = file_get_contents($this->url);
        $this->setFolder();
    }
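The constructor stores whatever file_get_contents returns, and that is false when the page cannot be fetched. As a minimal sketch of how that case could be caught early, the hypothetical helper fetchHtml below (not part of the original class) throws an exception instead of passing false along:

```php
<?php
// Hypothetical helper, not part of the original ZipImages class:
// fetch a page and fail loudly instead of silently returning false.
function fetchHtml($url) {
    // @ suppresses the PHP warning; the failure is reported via the exception.
    $html = @file_get_contents($url);
    if ($html === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    return $html;
}
```

Inside the constructor, the assignment could then read $this->html = fetchHtml($this->url); assuming the helper is loaded alongside the class.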
The created ZIP archive has a folder that contains the scraped images. The setFolder method below configures this.
By default, the folder name is set to image, but the method provides an option to change the name of the folder by simply passing the folder name as its argument.
    public function setFolder($folder = "image") {
        // if the folder doesn't exist, attempt to create it, then store its name in $folder
        if (!file_exists($folder)) {
            mkdir($folder);
        }
        $this->folder = $folder;
    }
setFileName provides an option to change the name of the ZIP file, with a default name set to zipImages:
    public function setFileName($name = "zipImages") {
        $this->fileName = $name;
    }
At this point, we instantiate the Symfony Crawler component to search for images, then download and save all the images into the folder.
    public function domCrawler() {
        // instantiate the Symfony DomCrawler component
        $crawler = new Crawler($this->html);
        // create an array of all scraped image links
        $result = $crawler
            ->filterXPath('//img')
            ->extract(array('src'));
        // download and save each image to the folder
        foreach ($result as $image) {
            $path = $this->folder."/".basename($image);
            $file = file_get_contents($image);
            $insert = file_put_contents($path, $file);
            // file_put_contents returns false on failure (0 only means an empty file)
            if ($insert === false) {
                throw new \Exception('Failed to write image');
            }
        }
    }
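The method above passes each src value straight to file_get_contents, which only works when the page uses absolute image URLs. As a rough sketch of how relative links could be handled, the hypothetical helper resolveImageUrl below (an assumption, not part of the original class) resolves a src against the page URL. It covers the common cases only and does not normalise "../" segments or data: URIs.

```php
<?php
// Hypothetical helper, not part of the original class: turn an <img> src
// into an absolute URL, using the page URL as the base.
function resolveImageUrl($pageUrl, $src) {
    // Already absolute (http:// or https://): use as-is.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    $parts  = parse_url($pageUrl);
    $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';
    $host   = isset($parts['host']) ? $parts['host'] : '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative, e.g. //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }
    // Root-relative, e.g. /img/a.png
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }
    // Path-relative: resolve against the directory part of the page path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $pos  = strrpos($path, '/');
    $dir  = ($pos === false) ? '/' : substr($path, 0, $pos + 1);
    return $origin . $dir . $src;
}
```

In the loop, calling resolveImageUrl($this->url, $image) before file_get_contents would be one way to use it.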
After the download is complete, we compress the image folder to a ZIP archive using our custom create_zip function.
    public function createZip() {
        $folderFiles = scandir($this->folder);
        // scandir returns false when the folder cannot be read
        if ($folderFiles === false) {
            throw new \Exception('Failed to scan folder');
        }
        // collect every entry except the "." and ".." directory links
        $fileArray = array();
        foreach ($folderFiles as $file) {
            if (($file != ".") && ($file != "..")) {
                $fileArray[] = $this->folder."/".$file;
            }
        }
        if (create_zip($fileArray, $this->fileName.'.zip')) {
            $this->status = <<<HTML
File successfully archived. <a href="$this->fileName.zip">Download it now</a>
HTML;
        } else {
            $this->status = "An error occurred";
        }
    }
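The create_zip function itself is not reproduced in this article. To give an idea of what such a function does, here is a stand-in sketch built on PHP's ZipArchive class (it needs the zip extension). It is not David Walsh's code; the name create_zip_sketch and its exact behaviour are assumptions chosen to match the way createZip calls it, with an array of file paths and a destination name.

```php
<?php
// Stand-in sketch, NOT the original create_zip function: add each existing
// file to a new ZIP archive and report success as a boolean.
function create_zip_sketch(array $files, $destination) {
    // Keep only the paths that exist as regular files.
    $valid = array_filter($files, 'is_file');
    if (count($valid) === 0) {
        return false;
    }
    $zip = new ZipArchive();
    // CREATE makes the archive if it is missing; OVERWRITE replaces an old one.
    if ($zip->open($destination, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        return false;
    }
    foreach ($valid as $file) {
        // The second argument is the name stored inside the archive, so a
        // relative path such as image/a.png keeps its folder in the ZIP.
        $zip->addFile($file, $file);
    }
    // close() writes the archive to disk and returns false on failure.
    return $zip->close();
}
```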
Lastly, we delete the created folder after the ZIP file has been created.
    public function deleteCreatedFolder() {
        $dp = opendir($this->folder) or die ('ERROR: Cannot open directory');
        // compare against false so a file named "0" does not end the loop early
        while (($file = readdir($dp)) !== false) {
            if ($file != '.' && $file != '..') {
                if (is_file("$this->folder/$file")) {
                    unlink("$this->folder/$file");
                }
            }
        }
        // release the directory handle before removing the folder
        closedir($dp);
        rmdir($this->folder) or die ('could not delete folder');
    }
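Calling die inside a class stops the whole script, which may not be what the calling code wants. As an alternative sketch, the hypothetical function deleteFlatFolder below (not part of the original class) does the same cleanup but returns true or false so the caller can decide how to react:

```php
<?php
// Hypothetical variant, not part of the original class: delete a folder of
// plain files and return true/false instead of calling die().
function deleteFlatFolder($folder) {
    // @ suppresses the warning for a missing folder; false signals the failure.
    $entries = @scandir($folder);
    if ($entries === false) {
        return false;
    }
    foreach (array_diff($entries, array('.', '..')) as $entry) {
        $path = $folder . '/' . $entry;
        // Only plain files are removed; a sub-folder makes rmdir() fail below.
        if (is_file($path) && !unlink($path)) {
            return false;
        }
    }
    return @rmdir($folder);
}
```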
Get the status of the operation, i.e. whether it was successful or an error occurred.
    public function getStatus() {
        echo $this->status;
    }
The process method runs all the methods above in order.
    public function process() {
        $this->domCrawler();
        $this->createZip();
        $this->deleteCreatedFolder();
        $this->getStatus();
    }
} // end of class ZipImages
You can download the full class from GitHub.
Class Dependency
For the class to work, the DomCrawler component and the create_zip function need to be included. You can download the code for this function here.
Download and install the DomCrawler component via Composer simply by adding the following require statement to your composer.json file:
    "symfony/dom-crawler": "2.3.*@dev"
Run $ php composer.phar install to download the library and generate the vendor/autoload.php autoloader file.
Using the Class
- Make sure all required files are included, via autoload or explicitly.
- Call the setFolder and setFileName methods and pass in their respective arguments. Only call the setFolder method when you need to change the folder name.
- Call the process method to put the class to work.
<?php
require_once 'zipfunction.php';
require_once 'vendor/autoload.php';
use Symfony\Component\DomCrawler\Crawler;
// instantiate the ZipImages class
$object = new ZipImages('http://sitepoint.com');
// set the folder name
$object->setFolder('pictureFrames');
// set the zip file name
$object->setFileName('myframes');
// run the whole process
$object->process();
Source: sitepoint.com