Pipelines, immutability and privates.

A recent addition to the PHP League was the pipeline package, which allows you to compose sequential operations. One of the key aspects of the pipeline is that it's immutable. During the development of Pipeline, I learned a few lessons that I'd like to share with you.


When I started working on the Pipeline package, I wanted to create a clean implementation of the pipeline design pattern. The pattern is very simple: in its most basic form, it lets you perform a series of operations on any given payload. Operations in the pipeline are referred to as stages. Each stage gets the chance to operate on the payload and return a result, which is passed on to the next stage in line. It's an architectural pattern, so it's used to create structure in otherwise messy code.

For example, this:

function doSomething(string $chars)  
{
    return $this->toUpperCase(
        $this->getFirst(
            $this->orderChars(
                $this->splitChars($chars)
            )
        )
    );
}

Can be written like:

function doSomething(string $chars)  
{
    return (new Pipeline)
        ->pipe([$this, 'splitChars'])
        ->pipe([$this, 'orderChars'])
        ->pipe([$this, 'getFirst'])
        ->pipe([$this, 'toUpperCase'])
        ->process($chars);
}

This already makes it easier to understand what's going on. The declaration of the stages is now a direct representation of how the payload moves through the different operations.
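To make the flow concrete, here is a minimal sketch of what processing a payload boils down to: every stage receives the return value of the stage before it. The `runStages` function and the example stages are made up for illustration and are not the package's actual code.

```php
<?php

// Hypothetical helper, not part of the Pipeline package: feeds the payload
// through each stage in order, handing every result to the next stage.
function runStages(array $stages, $payload)
{
    foreach ($stages as $stage) {
        $payload = call_user_func($stage, $payload);
    }

    return $payload;
}

$result = runStages([
    'str_split',                                              // 'php' becomes ['p', 'h', 'p']
    function (array $chars) { sort($chars); return $chars; }, // ['h', 'p', 'p']
    function (array $chars) { return $chars[0]; },            // 'h'
    'strtoupper',                                             // 'H'
], 'php');

var_dump($result); // string(1) "H"
```

The stages are listed in the order they run, which is exactly what the nested version obscures.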

As mentioned before, the Pipeline is immutable. In our case, this means whenever you pipe a new stage onto the pipeline, you're actually receiving a new pipeline instead of a modified one.

So, you can say that:

$pipeline = new Pipeline([$stage]);
$newPipeline = $pipeline->pipe($secondStage);

assert($pipeline !== $newPipeline, 'Pipelines should be immutable');  

Not a lot was needed to accomplish this. The implementation was rather simple. All we needed to do was let the constructor accept the stages and create a new pipeline in the pipe method:

public function __construct(array $stages = [])  
{
    foreach ($stages as $stage) {
        if (false === is_callable($stage)) {
            throw new InvalidArgumentException();
        }
    }

    $this->stages = $stages;
}

public function pipe(callable $stage)  
{
    $stages = $this->stages;
    $stages[] = $stage;

    return new static($stages);
}
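Put together, the pieces above could look like the following self-contained sketch. The class name `SketchPipeline` is a stand-in, and the `process` method is an assumption on my part, since it isn't shown here; the point is only to demonstrate that piping never changes the pipeline you started with.

```php
<?php

// Sketch only: a stripped-down stand-in for the real Pipeline class.
class SketchPipeline
{
    private $stages = [];

    public function __construct(array $stages = [])
    {
        foreach ($stages as $stage) {
            if (false === is_callable($stage)) {
                throw new InvalidArgumentException('Every stage must be callable.');
            }
        }

        $this->stages = $stages;
    }

    public function pipe(callable $stage)
    {
        // PHP arrays are copied by value, so appending to $stages
        // never touches $this->stages.
        $stages = $this->stages;
        $stages[] = $stage;

        return new static($stages);
    }

    // Assumed implementation: runs the payload through every stage in order.
    public function process($payload)
    {
        foreach ($this->stages as $stage) {
            $payload = call_user_func($stage, $payload);
        }

        return $payload;
    }
}

$trim = (new SketchPipeline)->pipe('trim');
$trimAndShout = $trim->pipe('strtoupper');

var_dump($trim->process('  hello '));         // string(5) "hello"
var_dump($trimAndShout->process('  hello ')); // string(5) "HELLO"
```

Because `$trim` is untouched by the second `pipe` call, it can safely be reused as the base for any number of other pipelines.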

Not much funky stuff going on here really, but one thing annoyed me. Whenever a stage is added, the constructor re-checks every item in the stages array. Of course, it's not a check I was willing to give up, because a newly created pipeline should guard against invalid arguments. Luckily, there's still a way around this. I remembered something about the visibility of properties in PHP which would remove the need for re-checking entirely.

When two objects are of the same type, they can access the private properties of the other. This might seem like a bug, but it's actually a very valuable feature. For instance, it allows you to perform very efficient equality checks without the need to publicly expose your internals.

For instance, a Money class might have an equality check:

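final class Money
{
    private $amount;
    private $currency;

    public function __construct(int $amount, Currency $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    public function equals(Money $otherMoney)
    {
        // Reading $otherMoney's private properties is allowed here,
        // because this code runs inside the Money class.
        return $this->amount === $otherMoney->amount
            && $this->currency->equals($otherMoney->currency);
    }
}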

As you can see in the example, the Money::equals method encapsulates all the logic needed for the equality check. It does this without calling any getters on the other Money object. This means that even if the class doesn't expose something publicly, you can still take it into account during these kinds of checks.
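If you want to see the visibility rule in action, here is a small runnable sketch. The `Currency` class below is a hypothetical value object written just for this demonstration; it shows that the same private property is readable from inside the class, even on another instance, while code outside the class is refused.

```php
<?php

// Hypothetical Currency value object, included only to make the example runnable.
final class Currency
{
    private $code;

    public function __construct(string $code)
    {
        $this->code = $code;
    }

    public function equals(Currency $other): bool
    {
        // Allowed: $other is a Currency and this code runs inside Currency.
        return $this->code === $other->code;
    }
}

$euro = new Currency('EUR');

var_dump($euro->equals(new Currency('EUR'))); // bool(true)
var_dump($euro->equals(new Currency('USD'))); // bool(false)

try {
    // Not allowed: this code runs outside the class scope.
    $value = $euro->code;
    $leaked = true;
} catch (Error $e) {
    // PHP 7+ throws an Error: the property is private.
    $leaked = false;
}

var_dump($leaked); // bool(false)
```

Visibility in PHP is enforced per class, not per instance, which is what makes this work.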

Now that we know this, we can take a look at our Pipeline::pipe method and use what we know about private visibility. Instead of this:

public function pipe(callable $stage)  
{
    $stages = $this->stages;
    $stages[] = $stage;

    return new static($stages);
}

We can bypass the constructor check and inject our stages directly into the $stages property:

public function pipe(callable $stage)  
{
    $stages = $this->stages;
    $stages[] = $stage;
    $pipeline = new static();
    $pipeline->stages = $stages;

    return $pipeline;
}

public function __construct(array $stages = [])  
{
    foreach ($stages as $stage) {
        // Not executed when called from pipe(): $stages is empty
        if (false === is_callable($stage)) {
            throw new InvalidArgumentException();
        }
    }

    $this->stages = $stages;
}

This gives us the same functionality as before, but we're saving time by not re-running the is_callable check on all existing stages every time a stage is added. While this is a very minor performance improvement, classes with more extensive checks can greatly benefit from such changes. It mitigates some of the costs that come with PHP's dynamic nature (which we love) while keeping the internal state of our object in order.
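To check that skipping the constructor loop doesn't open a hole, the sketch below exercises both entry points with an invalid stage. `GuardedPipeline` and its `stageCount` method are hypothetical names introduced for this example only; the constructor still validates arrays passed from outside, and the `callable` type hint on `pipe` covers each newly added stage.

```php
<?php

// Sketch of the optimized class; process() is left out for brevity.
class GuardedPipeline
{
    private $stages = [];

    public function __construct(array $stages = [])
    {
        foreach ($stages as $stage) {
            if (false === is_callable($stage)) {
                throw new InvalidArgumentException('Every stage must be callable.');
            }
        }

        $this->stages = $stages;
    }

    public function pipe(callable $stage)
    {
        // The existing stages were validated when they entered the object,
        // and the callable type hint covers the new one.
        $pipeline = new static();
        $pipeline->stages = $this->stages;
        $pipeline->stages[] = $stage;

        return $pipeline;
    }

    // Hypothetical accessor, added only so the example can inspect the result.
    public function stageCount(): int
    {
        return count($this->stages);
    }
}

$rejected = [];

try {
    new GuardedPipeline(['strtoupper', 'definitely_not_a_function']);
} catch (InvalidArgumentException $e) {
    $rejected[] = 'constructor';
}

try {
    (new GuardedPipeline)->pipe('definitely_not_a_function');
} catch (TypeError $e) {
    $rejected[] = 'pipe';
}

$pipeline = (new GuardedPipeline(['trim']))->pipe('strtoupper');

var_dump($rejected);               // contains 'constructor' and 'pipe'
var_dump($pipeline->stageCount()); // int(2)
```

Every path into the `$stages` property is still guarded; the only thing that disappears is the repeated validation of stages that were already checked.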

I hope this gives you some insight into the thoughts that went into the design process of the package and perhaps helps you out in similar situations.

Happy coding!