6.6 Before and After: Recursive Directory Iteration

The previous iterator examples handle only a flat list of items, but frequently your lists contain other lists. For instance, a directory can have other directories inside it, those child directories can contain additional directories, and so on.

Solve this problem with a recursive iterator, an iterator that works with multilevel lists. The following examples show how to iterate through a directory and all of its subdirectories.

6.6.1 PHP 4: Recursively Reading Files in a Directory

In PHP 4, the easiest way to process all the files in a directory and its children is to call a function recursively:

function iterate_dir($path) {
    $files = array( );

    // only read directories that exist and are readable
    if (is_dir($path) && is_readable($path)) {
        $dir = dir($path);
        while (false !== ($file = $dir->read( ))) {
            // skip . and ..
            if (('.' == $file) || ('..' == $file)) {
                continue;
            }
            if (is_dir("$path/$file")) {
                // recurse, then fold the child's files into the master list
                $files = array_merge($files, iterate_dir("$path/$file"));
            } else {
                array_push($files, $file);
            }
        }
        $dir->close( );
    }
    return $files;
}

$files = iterate_dir('/www/www.example.com');
foreach ($files as $file) {
    print "$file\n";
}

email.html
logo.gif
php.gif
auth.inc
user.inc
index.html
search.html

This function loops through every file in the current directory. If the file is a directory, the function recursively calls itself and passes the subdirectory name as the argument. These results are then merged back into a master list of files stored in the $files array using array_merge( ). When a file is not a directory, it's added to the list using array_push( ).

Repeatedly calling iterate_dir( ) is slow, but it allows you to order the files so that children live under their parents.

Prepending $path to the filename before it's merged into the $files array modifies the example output to include the full path:

function iterate_dir($path) {

    // same as before

    if (is_dir("$path/$file")) {
        $files = array_merge($files, iterate_dir("$path/$file"));
    } else {
        array_push($files, "$path/$file");
    }

    // same as before
}

$files = iterate_dir('/www/www.example.com');
foreach ($files as $file) {
    print "$file\n";
}

/www/www.example.com/email.html
/www/www.example.com/images/logo.gif
/www/www.example.com/images/php.gif
/www/www.example.com/includes/auth.inc
/www/www.example.com/includes/user.inc
/www/www.example.com/index.html
/www/www.example.com/search.html
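Because the listing above elides most of the function body, it may help to see the full-path variant written out in one piece. The following is a minimal sketch rather than the exact original code: the function name iterate_dir_full( ) and the throwaway directory created under sys_get_temp_dir( ) are assumptions made so the example can run anywhere. The results are sorted because the order in which a directory returns its entries is not guaranteed.

```php
<?php
// Hypothetical name; same logic as iterate_dir(), but stores full paths.
function iterate_dir_full($path) {
    $files = array();
    if (is_dir($path) && is_readable($path)) {
        $dir = dir($path);
        while (false !== ($file = $dir->read())) {
            // skip . and ..
            if (('.' == $file) || ('..' == $file)) {
                continue;
            }
            if (is_dir("$path/$file")) {
                $files = array_merge($files, iterate_dir_full("$path/$file"));
            } else {
                array_push($files, "$path/$file");
            }
        }
        $dir->close();
    }
    return $files;
}

// Build a small throwaway tree (assumed layout, for illustration only).
$root = sys_get_temp_dir() . '/iterate_dir_demo_' . uniqid();
mkdir("$root/images", 0777, true);
touch("$root/index.html");
touch("$root/images/logo.gif");

$files = iterate_dir_full($root);
sort($files);   // directory read order is not guaranteed

foreach ($files as $file) {
    // print each path relative to the temporary root
    print substr($file, strlen($root) + 1) . "\n";
}

// remove the throwaway tree again
unlink("$root/images/logo.gif");
unlink("$root/index.html");
rmdir("$root/images");
rmdir($root);
```

Every entry in the returned array begins with the starting path, so a caller can hand any of them straight to functions such as filesize( ) without rebuilding the location.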

6.6.2 PHP 5: Recursively Reading Files in a Directory

PHP 5 replaces that complicated work with a RecursiveDirectoryIterator. Use it like this:

$dir = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/www/www.example.com/'));
foreach ($dir as $file) {
    print "$file\n";
}

email.html
logo.gif
php.gif
auth.inc
user.inc
index.html
search.html

No, that's not a typo: there really is a class named RecursiveIteratorIterator. This class is an Iterator for classes that implement the RecursiveIterator interface. Think of PHP as automatically implementing an IteratorIterator as part of foreach, but to ensure children are properly traversed, you need to use this SPL class.

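Also, if you look closely, this listing contains only files. By default, RecursiveIteratorIterator returns just the leaves of the tree, so all the directories have been filtered out of the listing. To include them, pass true as a second argument to the RecursiveIteratorIterator (true is treated as mode 1, which current PHP versions name RecursiveIteratorIterator::SELF_FIRST):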

$dir = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/www/www.example.com/'), true);
foreach ($dir as $file) {
    print "$file\n";
}

email.html
images
logo.gif
php.gif
includes
auth.inc
user.inc
index.html
search.html

That's still not the most readable output, because you don't know which files live in which folders. Prepending the path helps:

$dir = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/www/www.example.com'), true);
foreach ($dir as $file) {
    print $file->getPathname( ) . "\n";
}

/www/www.example.com/email.html
/www/www.example.com/images
/www/www.example.com/images/logo.gif
/www/www.example.com/images/php.gif
/www/www.example.com/includes
/www/www.example.com/includes/auth.inc
/www/www.example.com/includes/user.inc
/www/www.example.com/index.html
/www/www.example.com/search.html

A quick call to the getPathname( ) method solves the problem. You may notice that you can call both print $file and $file->getPathname( ). Is $file a string or an object? It's an object, but it has a __toString( ) method that returns the file's name when it's printed.
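To see that each item really is an object, the sketch below asks it a couple of questions about itself. It assumes a current PHP release, where the items are SplFileInfo objects. The FilesystemIterator::SKIP_DOTS flag is passed explicitly so the . and .. entries are left out, and the temporary directory layout is made up for the demonstration.

```php
<?php
// Throwaway tree; the names are placeholders for this demonstration.
$root = sys_get_temp_dir() . '/spl_demo_' . uniqid();
mkdir("$root/images", 0777, true);
touch("$root/images/logo.gif");

$dir = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::SELF_FIRST);

$seen = array();
foreach ($dir as $file) {
    // $file is an object, so it can report on itself
    $kind = $file->isDir() ? 'dir ' : 'file';
    $seen[$file->getFilename()] = $kind;
    print $kind . ' ' . $file->getFilename() . "\n";
}

// release the iterator, then remove the throwaway tree
unset($dir);
unlink("$root/images/logo.gif");
rmdir("$root/images");
rmdir($root);
```

Calling getFilename( ) and getPathname( ) explicitly, rather than relying on string conversion, makes it unambiguous which form of the name you get.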

Alternatively, it would be nice to display these files using a pretty directory tree-style listing. For that, you need to implement a custom class that extends RecursiveIteratorIterator:

class DirectoryTreeIterator extends RecursiveIteratorIterator {

    function current( ) {
        return str_repeat('| ', $this->getDepth( )) . '|-' .
            parent::current( );
    }

}

$dir = new DirectoryTreeIterator(
    new RecursiveDirectoryIterator('/www/www.example.com/'), true);

foreach ($dir as $file) {
    print $file . "\n";
}

|-email.html
|-images
| |-logo.gif
| |-php.gif
|-includes
| |-auth.inc
| |-user.inc
|-index.html
|-search.html

The DirectoryTreeIterator class, instead of returning a simple object that represents the file, provides a graphical representation of the directory hierarchy.

Implementing this class requires a new method that's part of RecursiveIteratorIterator: getDepth( ). The getDepth( ) method returns a number that indicates how many levels down the current item lives. For files in the top-level directory, it returns 0; for files inside the images and includes directories, it returns 1; and so on.
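The same method works for any recursive iterator, not just directories. As a minimal sketch, the nested array below stands in for a directory tree (the array contents and the choice of RecursiveArrayIterator are assumptions for illustration), and getDepth( ) reports how far down each leaf sits:

```php
<?php
// A nested array standing in for a directory tree (illustrative data).
$tree = array(
    'email.html',
    array('logo.gif', 'php.gif'),
    'index.html',
);

$it = new RecursiveIteratorIterator(new RecursiveArrayIterator($tree));

$depths = array();
foreach ($it as $name) {
    // getDepth() is called on the iterator, not on the current item
    $depths[$name] = $it->getDepth();
    print $it->getDepth() . " $name\n";
}
```

This reports a depth of 0 for the two top-level names and 1 for the two names inside the nested array.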

With getDepth( ), it's easy to prepend a pipe and space (| ) for each level beyond the first. You could use a for loop here from 0 to getDepth( ), but it's faster to use str_repeat( ). After those characters come a pipe and dash (|-) and then the filename. This produces a simple graphic directory tree.
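To make the comparison concrete, here are both ways of building the prefix side by side. This is only a sketch; the function names prefix_with_loop( ) and prefix_with_repeat( ) are made up for the example.

```php
<?php
// Hypothetical helper: build the prefix one level at a time.
function prefix_with_loop($depth) {
    $prefix = '';
    for ($i = 0; $i < $depth; $i++) {
        $prefix .= '| ';
    }
    return $prefix . '|-';
}

// Hypothetical helper: let str_repeat() do the repetition.
function prefix_with_repeat($depth) {
    return str_repeat('| ', $depth) . '|-';
}

foreach (array(0, 1, 2) as $depth) {
    print prefix_with_repeat($depth) . "file\n";
}
```

Both helpers return the same string for any depth, so the choice between them comes down to brevity and speed.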