Battle Log: Symfony Routing performance considerations.

Last week I took a deep dive into Symfony's Routing component. A project I worked on suffered from a huge performance penalty caused by a routing mistake. This led me down the path of discovering some interesting performance considerations. Some common practices align nicely with Symfony's optimisations, so let's look into those.


TL;DR: Skip ahead directly to the performance tips.

But before we check out the tips, let's look at how I got here. In short, the application I was working on was crashing and we didn't know why.

Digging into a performance penalty.

The application suffered from performance issues, which slowly worsened over a couple of months. Previously the average response was served within around 120 to 160ms. That number grew somewhat out of control. Before hitting a critical state, the application's response times sometimes topped out at a whopping 750ms. This was not an acceptable situation. Luckily most users weren't affected because we cached the responses heavily with Varnish. Varnish is awesome, but like any form of caching it is great at hiding problems. Needless to say, the problems couldn't be kept hidden any longer.

Due to a real-life event the application was getting hit with six times the normal amount of traffic. The process which caused the longer response times was CPU-bound. Under extremely heavy load, the web nodes were hammered and maxed out their CPU usage. Throwing some extra metal at the problem gave us room to breathe in the short term. For the long term this was not a viable strategy. We really needed to get to the bottom of this.

Profiling for performance and glory.

In order to get a better understanding of what we were dealing with, we began to explore some options. With the help of Blackfire I was able to track down the bottleneck quite easily. For some reason routing was taking an extremely long time. When I dug a little deeper, some other strange things came to light. For example, there were calls made to the YamlFileLoader class. For Symfony, this should only happen in dev-mode.

We durped.

Due to some misconfiguration the route generation was not as strict as some of the developers would have liked. More specifically, it allowed routes to have mixed localised segments. Our routes are prefixed with a locale segment (/en or /nl). The first part was matched with a regex, making it possible to access URLs where the language prefix is not in the same locale as the slug. While this was not a huge concern, a developer decided to validate generated routes against the collection of routes gathered by Symfony and reject the ones with mixed locales. This approach had some flaws.

First off, fixing the root cause is always better than fighting symptoms. Second, verifying all the generated URLs against a collection of routes is in itself a rather expensive operation because it requires all the routes to be loaded. I'll dig into that a bit more later. First let's get back to how we found the root cause.

Finding the root cause.

While it's not always easy to find the root cause of a problem, there's a relatively simple trick to get closer to it, called The Five Whys. Basically this technique tells you to ask "why" five times to dig deeper into the problem space.

The 5 Whys.

For us this was:
Problem: Our application servers are crashing.

  1. Why? Because CPU usage is through the roof.
  2. Why? Because every request performs more CPU intensive tasks.
  3. Why? Because routing has become insanely expensive.
  4. Why? Well, because we seem to be parsing all the route files? What the hell?
  5. Why? I have no idea, I should really dig into this.

In our case we got to the root cause one step quicker than the five-step program.

Death by a thousand lines of YAML.

As it turned out, the generated URL validation was accessing a RouteCollection which contains all the application routes. However, in most route-related cases (matching requests and generating URLs) Symfony uses an optimised bit of code. This generated piece of code eliminates the need for loading all the routes. When you DO in fact request the route collection, all the route providers come into play. These providers resolve all the routes defined in an application by performing code scans, annotation checks, or (in our case) YAML parsing. If there's anything you DO NOT want to be doing on EVERY SINGLE HTTP REQUEST, it's parsing over 900 routes worth of YAML.
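
The difference between the two code paths can be sketched in plain PHP. This is a toy model rather than Symfony's implementation: `SketchRouter`, its `$loads` counter and the two example routes are made up for illustration, and the array flip merely stands in for the real loaders. In Symfony itself the corresponding calls are `match()` and `generate()` on the router for the optimised path, and `getRouteCollection()` for the expensive one.

```php
<?php

// Hypothetical stand-in for a router; NOT one of Symfony's real classes.
final class SketchRouter
{
    /** Number of times the (expensive) route definitions were loaded. */
    public int $loads = 0;

    /** @var array<string, string> path => route name (the "compiled" table) */
    private array $compiled;

    /** @var array<string, string>|null route name => path, loaded lazily */
    private ?array $collection = null;

    public function __construct(array $compiled)
    {
        $this->compiled = $compiled;
    }

    /** Cheap path: only consults the precompiled lookup table. */
    public function match(string $path): ?string
    {
        return $this->compiled[$path] ?? null;
    }

    /** Expensive path: forces every route definition to be loaded. */
    public function getRouteCollection(): array
    {
        if ($this->collection === null) {
            $this->loads++;
            // Stand-in for running all loaders (YAML parsing, code scans).
            $this->collection = array_flip($this->compiled);
        }

        return $this->collection;
    }
}

$router = new SketchRouter(['/en/blog' => 'blog_en', '/nl/blog' => 'blog_nl']);

$name = $router->match('/en/blog');    // answered from the compiled table
$loadsAfterMatch = $router->loads;     // 0: nothing had to be loaded

$all = $router->getRouteCollection();  // triggers the full load
$loadsAfterCollection = $router->loads; // 1
```

Every PHP request starts from scratch, so anything that touches the full collection on the request path pays that loading cost again on each request.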

Luckily, we were able to remove the need for obtaining the route collection entirely, which drastically improved the performance of our application. It brought the routing times back to ~7.5ms, which is nice. The response times of the system dropped back to pre-apocalypse norms.

The team was able to propose many improvements to prevent situations like this from happening in the future, and there were plenty of lessons to be learned. It also led to some great insights into Symfony's Routing component.

So let's dive into some Symfony Routing considerations!

Staring into the optimised code abyss.

While I was on this epic quest to prevent our application from being crushed under the load, I laid my eyes on the code that Symfony generates to speed up the routing process. This led to some interesting insights. Symfony has some clever tricks to increase the performance of routing.

Static prefix optimisation.

The optimised routes can eliminate groups of routes using a static prefix optimisation. The compiler tries to figure out common prefixes, matches the URI against said prefix and skips groups of routes if the prefix doesn't match. This improves performance because these checks are simple string comparisons. For example, if we have the following routes defined (pseudo code):

1. /blog/overview => BlogController::indexAction  
2. /blog/post/new => BlogController::viewAction  
3. /blog/post/{slug} => BlogController::viewAction  
4. /about-me/cv => AboutController::meAction  
5. /about-me/social => AboutController::socialAction  

The generated code will check if the path matches the static part of the routes first:

<?php

if (strpos($uri, '/blog') === 0) {  
    // Try to match route 1

    if (strpos($uri, '/blog/post') === 0) {
        // Try to match route 2 and 3
    }
}

if (strpos($uri, '/about-me') === 0) {  
    // Try to match route 4 and 5
}

Because of this, groups of routes with a common static prefix can be discarded from further matching entirely. As you can see in the code above, it also tries to optimise for subgroups. In large applications this can lead to quite significant performance improvements. While many other routing solutions are a little quick on the draw, going straight for regular expressions, Symfony doesn't. It's important to understand that regular expressions are great, but a simple string comparison will generally outperform a regex.
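
To see the grouping end to end, here is a hand-written sketch that fills in the placeholders for the five example routes. It is not the code Symfony generates: the function name `sketchMatch` and the choice to return the route number are assumptions made for illustration.

```php
<?php

/**
 * Prefix-grouped matching for the five example routes.
 * Returns the route number from the list, or null when nothing matches.
 */
function sketchMatch(string $uri): ?int
{
    if (strpos($uri, '/blog') === 0) {
        // Route 1 is fully static, so a plain comparison is enough.
        if ($uri === '/blog/overview') {
            return 1;
        }

        if (strpos($uri, '/blog/post') === 0) {
            // The static route is checked before the dynamic one.
            if ($uri === '/blog/post/new') {
                return 2;
            }
            if (preg_match('#^/blog/post/(?P<slug>[^/]++)$#s', $uri)) {
                return 3;
            }
        }
    }

    if (strpos($uri, '/about-me') === 0) {
        if ($uri === '/about-me/cv') {
            return 4;
        }
        if ($uri === '/about-me/social') {
            return 5;
        }
    }

    return null;
}

$post = sketchMatch('/blog/post/hello-world'); // 3
$cv = sketchMatch('/about-me/cv');             // 4, skips the whole /blog group
```

A request for /about-me/cv fails a single strpos check and never reaches the regular expression inside the /blog group.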

Simple static route matching.

Whenever possible, Symfony will try to match a route as simply as possible. When a URI is fully static (no dynamic parts) it will simply perform a string comparison.

if ($uri === '/simple/uri/here') {  
   return $matchedRoute;
}

However, as soon as it detects a dynamic part it'll need to do some matching using regular expressions. These operations are more expensive than simple string comparisons. But yet again, Symfony has a trick up its sleeve to avoid these penalties. Before it tries to match the URI against the regex, it checks whether a static prefix is present. For example, routes like:

/static/{dynamic}
/other-{pieces}

Will be checked like this:

if (strpos($uri, '/static/') === 0 && preg_match('#^/static/(?P<dynamic>[^/]++)$#s', $uri)) {  
    // Route 1 matches!
}

if (strpos($uri, '/other-') === 0 && preg_match('#^/other\\-(?P<pieces>[^/]++)$#s', $uri)) {
    // Route 2 matches!
}

This optimisation looks a lot like the static prefix optimisation. However, in this case its only job is to prevent any (more expensive) preg_match calls. As you can see in the second route, it's not limited to segments. The compiler will try to match as much as possible.
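
The same guard can be written as a small reusable helper that also hands back the captured values. This is only a sketch: `sketchMatchDynamic` and its parameters are hypothetical names, not part of Symfony.

```php
<?php

/**
 * Runs $regex against $uri only when $uri starts with $staticPrefix.
 * Returns the named captures on success, or null otherwise.
 */
function sketchMatchDynamic(string $uri, string $staticPrefix, string $regex): ?array
{
    // Cheap string check first; on a miss the regex is never executed.
    if (strpos($uri, $staticPrefix) !== 0) {
        return null;
    }

    if (!preg_match($regex, $uri, $matches)) {
        return null;
    }

    // preg_match returns named groups under both string and integer keys;
    // keep only the string-keyed (named) entries.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$regex = '#^/static/(?P<dynamic>[^/]++)$#s';

$hit = sketchMatchDynamic('/static/42', '/static/', $regex);     // ['dynamic' => '42']
$miss = sketchMatchDynamic('/elsewhere/42', '/static/', $regex); // null, regex skipped
```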

Performance considerations.

Knowing these performance optimisations we can look at our applications and try to benefit from them.

1. Try to group routes that share common prefixes.

Symfony does its best to eliminate groups of routes based on common static prefixes. However, Symfony also respects the order in which routes are defined. So to get the most out of this optimisation, try to group routes with common prefixes. The added benefit of this approach is that you'll end up with better URI design/semantics.
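
Why ordering matters can be shown with a deliberately simplified model. It assumes that only consecutive routes sharing a first path segment end up in the same group, which is an assumption made for this sketch and not Symfony's actual grouping algorithm. The function name `sketchCountGroups` is hypothetical.

```php
<?php

/**
 * Counts the prefix groups that form when only consecutive routes
 * with the same first path segment may share a group.
 */
function sketchCountGroups(array $paths): int
{
    $groups = 0;
    $previous = null;

    foreach ($paths as $path) {
        // First segment, e.g. "/blog/post/new" => "blog".
        $segment = explode('/', ltrim($path, '/'))[0];

        if ($segment !== $previous) {
            $groups++;
            $previous = $segment;
        }
    }

    return $groups;
}

$grouped = ['/blog/overview', '/blog/post/new', '/about-me/cv', '/about-me/social'];
$interleaved = ['/blog/overview', '/about-me/cv', '/blog/post/new', '/about-me/social'];

$groupedCount = sketchCountGroups($grouped);         // 2
$interleavedCount = sketchCountGroups($interleaved); // 4
```

Under this model, the same four routes produce two groups when defined together and four groups when interleaved, so each prefix check rules out fewer routes.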

2. Try to keep dynamic parts as far to the right side as possible.

Matching of dynamic routes is more expensive due to preg_match calls. Moving dynamic parts of the URI to the right allows Symfony to make better use of the static prefix checks.
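
A rough way to see the effect is to look at how much of a path precedes its first placeholder. The helper below is a simplified sketch with a hypothetical name (`sketchStaticPrefix`), and the example paths are illustrative. Symfony derives its static prefix while compiling the route and may cut it at a slightly different position.

```php
<?php

/** Returns the part of a route path that precedes its first {placeholder}. */
function sketchStaticPrefix(string $path): string
{
    $pos = strpos($path, '{');

    // No placeholder at all: the whole path is static.
    return $pos === false ? $path : substr($path, 0, $pos);
}

$early = sketchStaticPrefix('/{locale}/blog/post'); // "/"
$late = sketchStaticPrefix('/blog/post/{slug}');    // "/blog/post/"
```

With the placeholder up front, the static prefix shrinks to "/", which every URI starts with, so the cheap check rules nothing out.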

3. Consider using fewer dynamic parts.

Every route that has dynamic parts is inherently more expensive to match. In larger applications this can result in a lower baseline performance. At some point it could be worthwhile to cut down on the number of dynamic routes. For example, if you have a dynamic part in a URI but there will only ever be 4 variants of this route, consider writing them out! This may sound silly, but it'll cost you 5 minutes to write and save you time on EVERY request.
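
Writing the variants out does not have to be done by hand. As a sketch, a small helper can expand one dynamic path into its static equivalents, which can then be registered as ordinary static routes. The helper name `sketchExpand`, the `/help/{topic}` path and the four topic values are all made up for illustration.

```php
<?php

/**
 * Expands one dynamic path into static paths, one per known value
 * of the given placeholder.
 */
function sketchExpand(string $path, string $placeholder, array $values): array
{
    $static = [];

    foreach ($values as $value) {
        $static[] = str_replace('{' . $placeholder . '}', $value, $path);
    }

    return $static;
}

$paths = sketchExpand('/help/{topic}', 'topic', ['billing', 'shipping', 'returns', 'contact']);
// ['/help/billing', '/help/shipping', '/help/returns', '/help/contact']
```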

4. Try to improve Symfony's Routing component.

Lastly, if you're up for it, try to optimise the code Symfony generates. This won't only benefit you, but the entire Symfony community. Coincidentally, that's just what I did. When I was looking through the generated code I saw some function calls which were made over and over. So I decided to optimise that by storing the value in a local variable, which removes the need to call the same functions repeatedly. Apart from that, I optimised how HTTP methods were matched in certain cases (which was a tip from @fabpot).
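
The general idea of hoisting a repeated call into a local variable looks roughly like this. The sketch does not reproduce the actual change: `SketchContext`, both functions and the three routes are hypothetical, chosen only to make the number of calls visible.

```php
<?php

// Hypothetical request context that counts how often it is queried.
final class SketchContext
{
    public int $calls = 0;
    private string $method;

    public function __construct(string $method)
    {
        $this->method = $method;
    }

    public function getMethod(): string
    {
        $this->calls++;

        return $this->method;
    }
}

/** Calls getMethod() once per route that is inspected. */
function sketchMatchRepeated(SketchContext $context, array $routes): ?string
{
    foreach ($routes as $name => $allowedMethod) {
        if ($context->getMethod() === $allowedMethod) {
            return $name;
        }
    }

    return null;
}

/** Calls getMethod() once and reuses the local variable. */
function sketchMatchCached(SketchContext $context, array $routes): ?string
{
    $method = $context->getMethod();

    foreach ($routes as $name => $allowedMethod) {
        if ($method === $allowedMethod) {
            return $name;
        }
    }

    return null;
}

$routes = ['list' => 'GET', 'create' => 'POST', 'remove' => 'DELETE'];

$repeated = new SketchContext('DELETE');
$first = sketchMatchRepeated($repeated, $routes); // "remove", after 3 calls

$cached = new SketchContext('DELETE');
$second = sketchMatchCached($cached, $routes);    // "remove", after 1 call
```

Both functions return the same route; the second one just asks the context once instead of once per candidate.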

The cost of routing for the application I'm currently working on was brought down from ~7.5ms to below 2.5ms. For many applications out there this may not seem like much, but if your application is serving anywhere from 1.2k to close to 3k requests per minute those milliseconds begin to count. Apart from that it was a fun way to contribute back to the community.

Now all I have to do is hope that the PR will be merged.

[EDIT] After this optimisation, I've made another PR.