the:chris:walker

PHP Generator Pattern before 5.5

So, in PHP 5.5 we will have generators. That’s great news, but not many people are running 5.5 yet – so what can the rest of us do?

First, let’s look at what generators are and why they are useful.

Generators are like a lovechild of closures and iterators. A function becomes a generator when it uses the new yield keyword, and the object it returns can be used as an iterator. As an example, let’s take the “iterate through lines in a file” problem. A simple solution uses the file function.

<?php
foreach (file($file) as $line) {
  doWork($line);
}

This has the obvious drawback that the whole file is read into memory before we can iterate. With massive files and limited memory, this will be a problem.

So we could use the lower-level functions fopen and fgets to stream the file one line at a time.

<?php
$fh = fopen($file, 'r'); // fopen requires a mode; 'r' opens for reading
while( ($line = fgets($fh)) !== false ){
  doWork($line);
}
fclose($fh);

This is good, but it’s hard to reuse the loop around doWork for sources that don’t look like file pointers. Bring on generators.

<?php
function generator($file){
  $fh = fopen($file, 'r'); // fopen requires a mode; 'r' opens for reading
  while( ($line = fgets($fh)) !== false ){
    yield $line;
  }
  fclose($fh);
}

$iterator = generator($file);

foreach($iterator as $line){
  doWork($line);
}

Now it would be easy to swap in any other generator or iterator, and the loop will still work.
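As a rough sketch of that swap, the consuming loop can be wrapped in a function that accepts anything foreach can walk. The names processAll and countTo, and the doWork stub that just records its input, are made up for illustration; the generator part needs PHP 5.5 or later.

```php
<?php
// Hypothetical stand-in for the real doWork(): records what it was given.
function doWork($item) {
    global $seen;
    $seen[] = $item;
}

// Hypothetical consumer: accepts an array, an Iterator or a generator.
function processAll($source) {
    foreach ($source as $item) {
        doWork($item);
    }
}

// A generator (PHP 5.5+) that lazily yields 1..$n.
function countTo($n) {
    for ($i = 1; $i <= $n; $i++) {
        yield $i;
    }
}

$seen = array();
processAll(countTo(3));                          // a generator
processAll(new ArrayIterator(array('a', 'b')));  // a plain iterator
processAll(array('x'));                          // a plain array
```

The loop inside processAll never changes; only the source passed in does.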

Recently I wrote a piece of code for a CLI tool to do some work with a bunch of given user IDs. This tool needed to work out what to do based on a number of different inputs. The IDs could be piped into STDIN, requested with an ‘ALL’ argument (meaning go to the DB and get everyone), or passed as a list of CLI arguments.

I wanted to use generators to generate my IDs to work with. But alas I was restricted to PHP 5.3.

Now, I didn’t want to lose the clean code that comes with generators, so I fell back to closures and, surprisingly, they come pretty close to the main functionality. Code talks, so here is how I would have approached the line problem.

<?php
function generator($file){
  $fh = fopen($file, 'r'); // fopen requires a mode; 'r' opens for reading
  return function() use (&$fh) {
    if( ($line = fgets($fh)) !== false ){
      return $line;
    }
    fclose($fh);
    return null;
  };
}

$iterator = generator($file);

while( null !== $line = $iterator() ){
  doWork($line);
}

This is close to the generator style, but instead of the foreach loop we have a more complex while structure. If we accept that, however, it is easy to create compatible “generators” from many sources: each one is just a closure that returns the next item, or null when there is nothing left. In my original problem of reading different input types, the code ended up looking like this:

<?php
switch ($input_method) {
  case "stdin":
    $generator = function(){
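      // fgets reads one line from STDIN and returns false at end of input
      if(false !== $line = fgets(STDIN)){
        return rtrim($line, "\r\n"); // drop the trailing newline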
      }
      return null;
    };
    break;
  case "args":
    array_shift($argv); // drop the script name; note array_pop below yields the rest last-to-first
    $generator = function() use (&$argv) {
      return array_pop($argv);
    };
    break;
  case "all":
    $db_cursor = $database->users->find();
    $generator = function() use (&$db_cursor) {
      if($db_cursor->hasNext()){
        return $db_cursor->getNext();
      }
      return null;
    };
    break;
}

//now it's easy to do work
while( null !== $item = $generator() ){
  doWork($item);
}

This pattern should tide me over until PHP 5.5 is more widespread.
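If the while loop is the part that grates, one option is to wrap the closure in a small Iterator class so that foreach works again on 5.3. This is a sketch rather than anything from the original tool: the ClosureIterator name and the sample $ids array are made up for illustration, and it assumes the source never legitimately yields null. The #[\ReturnTypeWillChange] lines are plain comments before PHP 8 and only silence deprecation notices on 8.1 and later.

```php
<?php
// Hypothetical adapter: wraps a "return the next item, or null when done"
// closure so it can be used with foreach.
class ClosureIterator implements Iterator
{
    private $fetch;          // the closure that produces items
    private $current = null; // the item most recently fetched
    private $key = -1;       // -1 means nothing has been fetched yet

    public function __construct($fetch)
    {
        $this->fetch = $fetch;
    }

    #[\ReturnTypeWillChange]
    public function rewind()
    {
        // A closure source cannot be restarted, so rewind only primes
        // the first item and does nothing on later calls.
        if ($this->key === -1) {
            $this->next();
        }
    }

    #[\ReturnTypeWillChange]
    public function valid()
    {
        return $this->current !== null;
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        return $this->current;
    }

    #[\ReturnTypeWillChange]
    public function key()
    {
        return $this->key;
    }

    #[\ReturnTypeWillChange]
    public function next()
    {
        $fetch = $this->fetch;
        $this->current = $fetch();
        $this->key++;
    }
}

// Sample source: hands out IDs from an array, then null once it is empty.
$ids = array('7', '8', '9');
$source = function () use (&$ids) {
    return array_shift($ids);
};

$collected = array();
foreach (new ClosureIterator($source) as $i => $id) {
    $collected[$i] = $id;
}
```

The trade-off is an extra class in exchange for keeping the consuming code identical to what a real generator would need later.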