Deploying new features at AutoScout24
One of the things that some newcomers are surprised about when joining AutoScout24 is the way we deploy new features.
In this article I will try to give an overview of how we do it and some of the reasons behind it.
No staging environments
When we started to migrate our monolith to AWS-based micro-services, we decided that we no longer wanted to have staging or pre-production environments. Instead, we would only maintain production, and our engineers would deploy directly to it.
Having several environments increases deployment complexity, since one needs a mechanism to track which code is deployed to which environment.
Maintenance costs are also higher but, more importantly, every extra environment increases the time it takes for a release to make it to production.
Continuous delivery
Releasing our code directly to production in an automated way (that is, without any manual triggers) is how we make sure we do continuous delivery. We think there is no better way to achieve it than continuous deployment: every commit to master is released to production, with no intermediate stop other than running the tests.
One important aspect of this is that we want our product engineers to feel as comfortable as possible when releasing new code to our customers and consumers, and we think the best way to feel comfortable about something is to do it as many times as possible. At AutoScout24 our engineers release new code dozens of times a day, without thinking about it, just by using trunk-based development and making small commits.
Engineers release code to production dozens of times a day.
Another important aspect is that we optimise for MTTR (mean time to recovery) over MTBF (mean time between failures). That is, we would rather be able to recover from a failure very quickly than try to make sure that failures never happen. To enable that, we invest in excellent monitoring and alerting, even favouring it over testing if necessary (though we try to excel at both!).
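To make the MTTR versus MTBF trade-off concrete, here is a small sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The numbers are hypothetical and chosen only for illustration; they are not measurements from our systems.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a service is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical service A: fails about once a month, takes 4 hours to recover.
rare_but_slow = availability(mtbf_hours=720, mttr_hours=4)

# Hypothetical service B: fails ten times as often, recovers in 6 minutes.
frequent_but_fast = availability(mtbf_hours=72, mttr_hours=0.1)

print(f"rare failures, slow recovery:     {rare_but_slow:.4%}")
print(f"frequent failures, fast recovery: {frequent_but_fast:.4%}")
```

With these made-up figures, the service that fails more often but recovers quickly ends up with the higher availability, which is the intuition behind investing in fast detection and recovery.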
Test in production
One consequence of not having any environment other than production is that manual testing, as in UX checks or QA tests, is done directly in production. Rather than a problem, we see this as a great advantage — what better way to know if your service works in production than, well, testing it there?
The best way to know if code works in production is to test it there.
Some automated tests also run in production in the form of semantic monitoring, though the bulk of them run as part of the continuous integration pipeline.
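As a rough illustration of semantic monitoring, the sketch below checks that a response makes sense from a user's point of view rather than only checking that the endpoint answers. The `/search` path, the `listings` and `price` fields, and the `fetch` callable are all assumptions made for this example, not our actual API.

```python
import json
from typing import Callable, List, Tuple

# Hypothetical fetch signature: takes a URL, returns (status_code, body).
Fetch = Callable[[str], Tuple[int, str]]


def check_search_returns_listings(fetch: Fetch, base_url: str) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    status, body = fetch(f"{base_url}/search?make=bmw")
    if status != 200:
        return [f"unexpected status {status}"]
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(payload, dict):
        return ["response is not a JSON object"]

    listings = payload.get("listings") or []
    if not listings:
        return ["search returned no listings"]

    problems = []
    for listing in listings:
        price = listing.get("price")
        # A listing without a positive price would be meaningless to a user.
        if not isinstance(price, (int, float)) or price <= 0:
            problems.append(f"listing {listing.get('id')} has an invalid price")
    return problems
```

A scheduler would run a check like this against production every few minutes and raise an alert whenever the returned list is not empty.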
Many of these tests are written as functional tests, especially for APIs, and in many of them the service under test integrates directly with the production instances of its dependencies instead of mocking them. For write APIs, it is important to support test data across the different funnels, which allows us to test without interfering with our users.
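One possible way to support test data in a write API is to flag records created by tests and keep them out of anything users can see. The sketch below is an assumption about how this could look; the `X-Test-Data` header, the `Listing` type, and the in-memory store are hypothetical and only serve to show the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping


@dataclass
class Listing:
    listing_id: str
    title: str
    is_test: bool = False


class ListingStore:
    """In-memory stand-in for a write API backed by a real database."""

    def __init__(self) -> None:
        self._listings: Dict[str, Listing] = {}

    def create(self, listing_id: str, title: str, headers: Mapping[str, str]) -> Listing:
        # Hypothetical convention: tests mark their writes with a header.
        is_test = headers.get("X-Test-Data", "").lower() == "true"
        listing = Listing(listing_id, title, is_test)
        self._listings[listing_id] = listing
        return listing

    def public_search(self) -> List[Listing]:
        # Test data is never shown to real users.
        return [l for l in self._listings.values() if not l.is_test]

    def test_search(self) -> List[Listing]:
        # Tests can still read back what they wrote.
        return [l for l in self._listings.values() if l.is_test]


store = ListingStore()
store.create("1", "Real car", headers={})
store.create("2", "Car created by a functional test", headers={"X-Test-Data": "true"})
```

The important property is that the flag travels with the record, so every downstream funnel can apply the same filter.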
Also, our micro-frontends engine can be run locally so that one or more of the micro-frontends are resolved against locally running services, while everything else keeps pointing to production. This enables us to test UI changes locally in an environment that is as close to production as possible.
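The resolution logic described above could be sketched roughly as follows. This is not our engine's code: the fragment names, the production base URL, and the shape of the override map are assumptions made for illustration.

```python
from typing import Dict, Iterable, Mapping, Optional

# Hypothetical production location of all micro-frontends.
PRODUCTION_BASE = "https://fragments.example.com"


def resolve_fragment_urls(
    fragment_names: Iterable[str],
    local_overrides: Optional[Mapping[str, str]] = None,
    production_base: str = PRODUCTION_BASE,
) -> Dict[str, str]:
    """Map each micro-frontend to a local URL if overridden, else to production."""
    overrides = local_overrides or {}
    return {
        name: overrides.get(name, f"{production_base}/{name}")
        for name in fragment_names
    }


# Only the fragment being worked on is served from the developer's machine.
urls = resolve_fragment_urls(
    ["header", "search-results", "footer"],
    local_overrides={"search-results": "http://localhost:9000/search-results"},
)
```

Here `header` and `footer` still come from production, so the page under test differs from the live one only in the fragment being changed.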
Feature toggles
Another important thing to deal with is that, while we are manually testing our changes in production, we do not want our users to be exposed to the new code. To achieve that we use the well-known technique of feature toggles.
Our tool of choice for this is Toguru. It allows individual toggles to be overridden via browser query parameters or cookies, so engineers can easily run their manual and automated tests before turning a toggle on for the general public.
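The override idea can be illustrated with a generic sketch. This is not Toguru's API or its actual parameter format: the `toggle.` prefix, the `is_enabled` function, and the precedence order (query parameter, then cookie, then the configured state) are assumptions chosen for this example.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs

# Hypothetical naming convention for per-request overrides.
OVERRIDE_PREFIX = "toggle."


def _parse_bool(value: str) -> Optional[bool]:
    """Return True/False for recognised values, None for anything else."""
    normalised = value.strip().lower()
    if normalised in ("true", "1", "on"):
        return True
    if normalised in ("false", "0", "off"):
        return False
    return None


def is_enabled(
    name: str,
    default_states: Mapping[str, bool],
    query_string: str = "",
    cookies: Optional[Mapping[str, str]] = None,
) -> bool:
    """Resolve a toggle: query parameter first, then cookie, then default state."""
    key = OVERRIDE_PREFIX + name

    params = parse_qs(query_string)
    if key in params:
        parsed = _parse_bool(params[key][-1])
        if parsed is not None:
            return parsed

    if cookies and key in cookies:
        parsed = _parse_bool(cookies[key])
        if parsed is not None:
            return parsed

    # Unknown toggles are treated as off, so new code stays hidden by default.
    return default_states.get(name, False)


defaults = {"new-search": False}
for_users = is_enabled("new-search", defaults)
for_engineer = is_enabled("new-search", defaults, query_string="toggle.new-search=true")
```

With this arrangement the general public keeps seeing the old behaviour while an engineer, or an automated test, opts into the new code for a single request or browser session.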
Shadow traffic
Last but not least, we also send shadow traffic to new services or features when possible. This allows us to gain confidence in their performance and behaviour under real traffic, both in volume and in shape.
This means that a service needs to be accessible as early as possible, though probably hidden from users. Once that’s the case, traffic can be sent to the service, giving us early insights into performance, monitoring, etc.
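A minimal sketch of the shadowing idea follows. It is an illustration under assumptions rather than our implementation: the `ShadowingProxy` class and its callables are hypothetical, and in practice mirroring is often done at the load balancer or proxy layer rather than in application code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Tuple

Handler = Callable[[Any], Any]


class ShadowingProxy:
    """Answers from the primary service and mirrors each request to a shadow."""

    def __init__(self, primary: Handler, shadow: Handler, max_workers: int = 4) -> None:
        self._primary = primary
        self._shadow = shadow
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self.mismatches: List[Tuple[Any, Any, Any]] = []
        self.shadow_errors: List[Tuple[Any, str]] = []

    def handle(self, request: Any) -> Any:
        response = self._primary(request)
        # Mirror in the background so the shadow never slows down the user.
        self._pool.submit(self._mirror, request, response)
        return response

    def _mirror(self, request: Any, primary_response: Any) -> None:
        try:
            shadow_response = self._shadow(request)
        except Exception as exc:
            # Shadow failures are recorded, never surfaced to the user.
            self.shadow_errors.append((request, repr(exc)))
            return
        if shadow_response != primary_response:
            self.mismatches.append((request, primary_response, shadow_response))

    def close(self) -> None:
        # Wait for in-flight shadow calls before shutting down.
        self._pool.shutdown(wait=True)
```

The user always receives the primary's answer; the shadow's responses and errors only feed monitoring, which is what makes it safe to point real traffic at code that is not yet released.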
With this article I have tried to give you a good understanding of how we release new features at AutoScout24 and, more importantly, why we do it the way we do. Like many others, I also found it surprising when I joined the company, but now I couldn’t think of doing it any differently!
If you want to experience techniques like these in an encouraging environment, surrounded by great people, get in touch.
Principal Software Engineer at AutoScout24