6.4 Chaining Iterators

An important iterator feature is the ability to chain them together so they can act as a series of filters. For example, you can use iterators to restrict results to words that match a particular regular expression or to return only the first 10 results.

I call these types of iterators meta-iterators. SPL comes with two meta-iterators: FilterIterator, to filter results, and LimitIterator, to limit results.

6.4.1 Filtering Results with FilterIterator

FilterIterator is an abstract class that implements all the methods of a regular Iterator. However, it has a twist: you must define an accept( ) method that controls whether an item should be returned or filtered out from the results.

Unlike DirectoryIterator, which is directly instantiable, you cannot create a new FilterIterator. Instead, you must extend it and implement accept( ).

Here's an example that filters by a Perl-compatible regular expression:

class RegexFilter extends FilterIterator {

    protected $regex;

    public function __construct(Iterator $it, $regex) {
        // FilterIterator handles the iteration over $it
        parent::__construct($it);
        $this->regex = $regex;
    }

    // Return a true value to keep the current item, a false value to drop it
    public function accept() {
        return preg_match($this->regex, $this->current());
    }
}

RegexFilter takes two arguments in its constructor: an Iterator to filter and a regular expression pattern to use as a filter. The first parameter is passed on to the parent FilterIterator constructor, because it handles the iteration for your class.

The regular expression (regex for short) is stored in a protected property, $regex, for use inside the accept( ) method. This method must return true for items you wish to return and false for the ones that should be removed.

Handily, preg_match( ) returns the number of times the pattern matches the string. Because it stops searching after the first match, this number is always 0 or 1, or false if an error occurs. (For multiple matches, use preg_match_all( ).) Since 1 evaluates as true and 0 as false, you can directly pass along preg_match( )'s return value.
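To see those return values for yourself, here's a minimal sketch. The sample strings are hypothetical and chosen only for illustration; they don't come from the directory used elsewhere in this section.

```php
<?php
// Sample strings are hypothetical, chosen only to show the return values.
$hit  = preg_match('/html$/i', 'INDEX.HTML');   // pattern matches
$miss = preg_match('/html$/i', 'logo.png');     // pattern does not match

var_dump($hit);          // int(1)
var_dump($miss);         // int(0)
var_dump((bool) $hit);   // bool(true)
var_dump((bool) $miss);  // bool(false)

// preg_match() stops after the first match; preg_match_all() keeps counting.
$once = preg_match('/a/', 'banana');
$all  = preg_match_all('/a/', 'banana', $found);
print "$once $all\n";    // 1 3
```

The cast to bool is what happens implicitly when the return value is used as a yes-or-no answer inside accept( ).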

Because $regex is a Perl-compatible regular expression, you must place pattern delimiters around your regex. The regular expression is checked against the value of $this->current( ). Alternatively, you could check it against $this->key( ), depending on how you want the iterator to work.
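As a rough sketch of the key-based alternative, the class below tests $this->key( ) instead of $this->current( ). The class name KeyRegexFilter and the sample array of file sizes are my own hypothetical additions, and the ": bool" return type assumes PHP 7 or later; treat it as one possible variation rather than the definitive approach.

```php
<?php
// Hypothetical variant of RegexFilter that tests the key, not the value.
class KeyRegexFilter extends FilterIterator {

    protected $regex;

    public function __construct(Iterator $it, $regex) {
        parent::__construct($it);
        $this->regex = $regex;
    }

    // The ": bool" return type is an assumption (PHP 7 or later).
    // Comparing against 1 turns preg_match()'s 0, 1, or false into a boolean.
    public function accept(): bool {
        return preg_match($this->regex, $this->key()) === 1;
    }
}

// Hypothetical data: file names as keys, sizes in bytes as values.
$sizes = new ArrayIterator(array(
    'index.html' => 1024,
    'logo.png'   => 2048,
    'about.HTML' => 512,
));

$matches = array();
foreach (new KeyRegexFilter($sizes, '/html$/i') as $name => $size) {
    $matches[$name] = $size;
    print "$name: $size\n";
}
```

This prints the index.html and about.HTML entries and skips logo.png, because only the keys are checked against the pattern.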

This example allows you to use RegexFilter inline with a foreach loop and DirectoryIterator:

$dir = new DirectoryIterator('/www/www.example.com/');
$filtered_dir = new RegexFilter($dir, '/html$/i');

foreach ($filtered_dir as $file) {
    print "$file\n";
}

email.html
index.html
search.html

You pass RegexFilter two arguments, an iterator and the regular expression pattern. This example returns files ending with html and eliminates the others. The regular expression uses the /i modifier to do a case-insensitive check.

When you only want to iterate over the objects once, you can create both RegexFilter and DirectoryIterator inside the foreach:

foreach (new RegexFilter(
          new DirectoryIterator('/www/www.example.com/'), '/html$/i')
          as $file) {
    print "$file\n";
}

email.html
index.html
search.html

6.4.2 Limiting Results with LimitIterator

Another useful meta-iterator is LimitIterator. This Iterator, which behaves just like the SQL LIMIT clause, allows you to filter nondatabase listings with the same logic you use with a database.

This example returns only the third and fourth files in the directory:

foreach (new LimitIterator(
          new DirectoryIterator('/www/www.example.com/'), 2, 2)
          as $file) {
    print "$file\n";
}

email.html
images

Unlike FilterIterator, LimitIterator doesn't require you to implement a method; therefore, it's directly instantiable. You don't need to extend it.

Its constructor takes three arguments: the iterator, the start position, and the number of items to return. Items start at position 0, so the third is at position 2. That's why this example passes 2, 2 as the second and third parameters.
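If you don't have a directory handy, the same offset-and-count logic can be tried on an ArrayIterator. In this sketch the array of names is a hypothetical stand-in for a directory listing; a real DirectoryIterator may return entries in a different order.

```php
<?php
// Hypothetical stand-in for a directory listing.
$files = new ArrayIterator(array(
    '.', '..', 'email.html', 'images', 'index.html', 'search.html',
));

// Start at position 2 (the third item) and return 2 items.
$page = array();
foreach (new LimitIterator($files, 2, 2) as $position => $file) {
    $page[$position] = $file;
    print "$position: $file\n";
}
```

This prints "2: email.html" and "3: images". The keys come from the underlying iterator, so they start at 2 instead of being renumbered from 0.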

Of course, you can combine all three Iterators in a single chain. For example, to find only the first file ending in html in the /www/www.example.com/ directory:

foreach (new LimitIterator(new RegexFilter(
          new DirectoryIterator('/www/www.example.com/'), '/html$/i'), 0, 1)
          as $file) {
    print "$file\n";
}

email.html

It's important to order the Iterators correctly, or you won't get the results you expect. Don't switch RegexFilter and LimitIterator:

foreach (new RegexFilter(new LimitIterator(
          new DirectoryIterator('/www/www.example.com/'), 0, 1), '/html$/i')
          as $file) {
    print "$file\n";
}

This prints nothing! LimitIterator returns only one record: the file named ".". Then RegexFilter comes along, sees that the record doesn't end in html, and eliminates it. This leaves you with no results.
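Here's a self-contained sketch that runs both orderings side by side. It swaps the directory for an ArrayIterator over a hypothetical list of names (an assumption, so the result doesn't depend on your filesystem), and its accept( ) method uses a ": bool" return type that assumes PHP 7 or later.

```php
<?php
class RegexFilter extends FilterIterator {

    protected $regex;

    public function __construct(Iterator $it, $regex) {
        parent::__construct($it);
        $this->regex = $regex;
    }

    // The ": bool" return type is an assumption (PHP 7 or later).
    public function accept(): bool {
        return preg_match($this->regex, $this->current()) === 1;
    }
}

// Hypothetical stand-in for a directory listing.
$names = array('.', '..', 'email.html', 'images', 'index.html');

// Filter first, then limit: the first item that survives the filter.
$filterThenLimit = iterator_to_array(
    new LimitIterator(
        new RegexFilter(new ArrayIterator($names), '/html$/i'), 0, 1),
    false  // renumber keys from 0
);

// Limit first, then filter: only "." reaches the filter, and it is rejected.
$limitThenFilter = iterator_to_array(
    new RegexFilter(
        new LimitIterator(new ArrayIterator($names), 0, 1), '/html$/i'),
    false
);

print_r($filterThenLimit);  // one element: email.html
print_r($limitThenFilter);  // empty array
```

The innermost iterator runs first, so read a chain from the inside out when deciding where each meta-iterator belongs.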