pytest daemon: 10X Local Test Iteration Speed

At Discord, we utilize a Python monolith to power our API, from sending messages to managing Nitro subscriptions. To support this, we use pytest to write and run unit tests.

Over the last 8 years, the time it takes to run a single test has grown steadily, to the point where running one test now takes a whopping 13 seconds.

To clarify, even if the test ends up doing absolutely nothing, 13 seconds is the bare minimum it takes:

Most of the testing time is spent in our global conftest.py file, which contains slow imports and fixtures with scope=session. We refer to this internally as "importing the universe," and to say it straight: the lack of dependency boundaries in this project is an issue that’s worsened with time. We’re actively working to address this in the long term by breaking our monolith into modules, reducing the number of imports & fixtures required to run a single test.

However, since running a single unit test happens frequently, we started looking into potential stop-gap solutions.

Why is it important?

Simply put: humans get distracted. When I’m waiting for a test for more than a few seconds, I might click on a notification, open my browser, or get distracted for a few minutes. These distractions extend the feedback loop and make me less efficient in completing my current task.

An XKCD comic. Two engineers are swordfighting, looking as if they're not working. The caption reads: "The #1 programmer excuse for legitimately slacking off: 'My code's compiling.'" — *https://3d.xkcd.com/303/*

Solution: Wait ahead of time!

After tinkering for a bit, I came up with a simple but imperfect solution that speeds things up significantly, which I call “pytest daemon.” The gist of the approach is to keep an already-loaded process on standby, letting us run any test quickly without waiting for a new Python process to finish its imports.

Our daemon manager watches for any code changes, and if something changes, a new process is spawned.
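As a rough illustration of the "watch for code changes" part, here is a minimal polling sketch. It assumes change detection by comparing file modification times and sizes between two snapshots; the helper names `snapshot` and `changed_files` are hypothetical, and a real watcher may well rely on OS file notifications instead.

```python
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map every .py file under root to its (mtime_ns, size) fingerprint."""
    result = {}
    for path in root.rglob("*.py"):
        stat = path.stat()
        result[path] = (stat.st_mtime_ns, stat.st_size)
    return result


def changed_files(before: dict, after: dict) -> set:
    """Return files that were added, removed, or modified between snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }


# Demo in a throwaway directory: edit one of two files and detect the change.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "model.py").write_text("X = 1\n")
    (root / "view.py").write_text("import model\n")
    before = snapshot(root)
    (root / "view.py").write_text("import model  # edited\n")
    after = snapshot(root)
    changed_names = sorted(path.name for path in changed_files(before, after))

print(changed_names)  # ['view.py']
```

A manager built this way would take a snapshot on a timer and spawn a new process whenever `changed_files` returns a non-empty set.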

A flowchart that goes as such: "Dev -> Test -> daemon manager -> daemon"

The difference between running a vanilla test and running a test with the daemon is the command you type: instead of the regular pytest command, you use a script that transforms the arguments into a REST HTTP request. The request is sent to the daemon manager, which proxies it to the active daemon.
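A sketch of what such a wrapper script could look like, using only the standard library. The URL, port, path, and JSON payload shape are all assumptions made up for illustration; the post does not describe the actual protocol.

```python
import json
import sys
import urllib.request

# Assumption: the daemon manager listens locally on this hypothetical address.
DAEMON_MANAGER_URL = "http://127.0.0.1:8765/run"


def build_request(pytest_args: list) -> urllib.request.Request:
    """Wrap the pytest command-line arguments in a JSON POST request."""
    body = json.dumps({"args": pytest_args}).encode("utf-8")
    return urllib.request.Request(
        DAEMON_MANAGER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_remote(pytest_args: list) -> str:
    """Send the request to the daemon manager and return the test output."""
    with urllib.request.urlopen(build_request(pytest_args)) as response:
        return response.read().decode("utf-8")


# Building the request needs no running server, so it is safe to inspect here.
example = build_request(["tests/test_messages.py::test_send", "-x"])
print(example.get_method(), example.full_url)
```

With a manager running, a command-line wrapper would call `run_remote(sys.argv[1:])` and print the result, so it can stand in for the usual pytest invocation.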

Here be dragons: Hot reloading

So, it’s time to get to work: let’s assume we have a process ready to run some tests, but then we end up making one last change. Should we wait a whole 13 seconds for a new process to start? Pfft… nahh! We can call upon the dark arts of hot reloading with importlib.reload.
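Python's standard library exposes hot reloading through importlib.reload. The following self-contained sketch shows the basic mechanism on a throwaway module; the module name `hot_demo` and its contents are made up for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid writing .pyc files so the demo can never pick up stale bytecode.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "hot_demo.py"
    module_path.write_text("VALUE = 'old'\n")

    sys.path.insert(0, tmp)
    try:
        module = importlib.import_module("hot_demo")
        value_before = module.VALUE

        # Simulate the developer saving "one last change" to the file.
        module_path.write_text("VALUE = 'new and improved'\n")

        # Re-execute the module body inside the existing module object.
        importlib.reload(module)
        value_after = module.VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("hot_demo", None)

print(value_before, "->", value_after)  # old -> new and improved
```

Note that reload re-runs the module in place: other modules that already copied names out of it (for example with `from hot_demo import VALUE`) keep the old objects until they are reloaded too.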

To use this tactic, we need to determine which modules must be reloaded and in which order. For instance, if we have the following imports:

A flowchart with two paths. One goes: "test -> import -> view -> import -> model." The other goes: "test -> import -> model."

If test.py is modified, we only need to reload it.
If view.py is modified, we need to reload view.py and then test.py, in that order.
If model.py is modified, we need to reload model.py, view.py, and test.py, in that exact order.

To know which files to load, our daemon manager starts by analyzing all our source code:

1. Build an import graph for our code, then reverse it so that each node points to all the files that directly import it:

A flowchart with two paths. One goes: "model -> dependency -> test." The other goes: "model -> dependency -> view -> dependency -> test."

2. Create a topological sort of the dependency graph from step one. For example: [model.py, view.py, test.py]
3. When a file is modified, use the graph from step one to collect the files that depend on it. If there are several, we order them according to the topological sort from step two.
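The steps above can be sketched with the standard library's ast and graphlib modules (Python 3.9+). This is an illustration under assumptions, not the actual implementation: modules are identified by bare name rather than file path, the `SOURCES` dictionary stands in for reading real files, and importers are followed transitively when collecting what to reload.

```python
import ast
from collections import deque
from graphlib import TopologicalSorter

# Hypothetical sources mirroring the example: test imports view and model,
# and view imports model.
SOURCES = {
    "model": "",
    "view": "import model\n",
    "test": "import view\nimport model\n",
}


def direct_imports(source: str, known: set) -> set:
    """Return the known modules that this source imports directly."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found & known


imports = {name: direct_imports(src, set(SOURCES)) for name, src in SOURCES.items()}

# Step 1: reverse the import graph (module -> modules that import it).
importers = {name: set() for name in SOURCES}
for name, deps in imports.items():
    for dep in deps:
        importers[dep].add(name)

# Step 2: topological order, with every module placed after what it imports.
topo_order = list(TopologicalSorter(imports).static_order())
position = {name: index for index, name in enumerate(topo_order)}


def reload_order(modified: list) -> list:
    """Step 3: collect the modified modules plus their importers, then sort."""
    affected = set(modified)
    queue = deque(modified)
    while queue:
        current = queue.popleft()
        for importer in importers[current]:
            if importer not in affected:
                affected.add(importer)
                queue.append(importer)
    return sorted(affected, key=position.__getitem__)


print(reload_order(["model"]))  # ['model', 'view', 'test']
print(reload_order(["view"]))   # ['view', 'test']
```

Each name in the returned list would then be passed to importlib.reload in sequence.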

💡 Without a topological sort, we might reload in the wrong order. Consider what happens if model is modified and we reload in the order model, test, view. When importlib.reload(test) runs, test picks up a reference to the old version of view, which, in turn, still uses an obsolete version of model.

Reloading the files in the right order doesn’t solve all of our issues. Consider the following task decorator, which has the below sanity check:

In the example above, reloading task.py will raise an exception! To work around these kinds of issues, we added a patching point:

To fix the example above, we can implement the following patch:
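The sketch below illustrates this kind of failure and workaround under stated assumptions. The decorator `task`, the registry `TASKS`, and the hook `before_reload` are hypothetical names invented for the example, and `run_module_body` merely stands in for importing or reloading a module by executing its body.

```python
# Hypothetical registry of task functions, keyed by function name.
TASKS = {}


def task(fn):
    """Register fn as a task; the sanity check rejects duplicate names."""
    if fn.__name__ in TASKS:
        raise ValueError(f"task {fn.__name__!r} is already registered")
    TASKS[fn.__name__] = fn
    return fn


def before_reload(module_name: str) -> None:
    """Hypothetical patching point: forget the tasks a module defined
    so that re-executing the module can register them again."""
    for name, fn in list(TASKS.items()):
        if fn.__module__ == module_name:
            del TASKS[name]


# Source of a made-up module that defines one task.
MODULE_SOURCE = "@task\ndef send_email():\n    return 'sent'\n"


def run_module_body(module_name: str) -> None:
    """Stand-in for import/reload: execute the module body from scratch."""
    namespace = {"__name__": module_name, "task": task}
    exec(MODULE_SOURCE, namespace)


run_module_body("jobs")  # first import registers send_email

try:
    run_module_body("jobs")  # naive reload trips the sanity check
except ValueError as error:
    reload_error = str(error)

before_reload("jobs")    # apply the patch before reloading
run_module_body("jobs")  # now the reload succeeds
print(reload_error)
```

The idea is that the patch runs just before the module body is executed again, so the registry is in the same state it would be in for a fresh process.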

Hard as we try, there might be cases where hot-reloading fails or where it’s simply not worth the effort to work around these failures, which is why we always start a new, clean process in parallel to our reloading attempts. If our new method doesn’t work, we still have ol’ reliable within at most 13 seconds from the last save.

How Much Faster Is It?

It’s so fast. Just look at these results:

A table comparing two results. One row reads: "empty test: with daemon 0.4 seconds, without daemon 12.7 seconds." The second row reads: "complex test with many dependencies: with daemon 3.9/2.4/2.4 seconds, without daemon 17.5 seconds."

A few notes when looking at these results:

Repeated runs on the same daemon can be faster than the first one, because one-time work, such as loading additional modules, has already been done.
Running our “empty test” on vanilla Python without any imports or conftest.py takes 0.6 seconds, which means our stop-gap solution is faster than we could ever get by removing unneeded imports or fixtures.

Looking at aggregated runs of our users in a given month, we found that we reduced the median test duration from 20 seconds down to 2 seconds!

VS Code Integration

pytest is well supported by popular IDEs like VS Code. However, our pytest daemon breaks some of that functionality. To address this, we developed a small plugin that provides the following features:

Test the current method.
Run all tests in the current file.
Repeat the last run.
Start the daemon manager.
Force a full reload of the daemon in case of any issues.

While our plugin may not be as good as the built-in integration, it’s a close approximation. The speed gain is worth sacrificing a few fancy features.

It’s important to note that debugging still works seamlessly. When you debug the daemon manager, VS Code automatically attaches to any child process it spawns, with all breakpoints functioning as expected.

Open Source Projects

While drafting this blog post, I searched around the web to see if anyone else was having a similar issue and found an open source project that implemented a similar approach. It even used the same name!

Both projects were created independently around the same time, which is pretty cool. Great minds think alike!

If you run into a similar issue, consider checking their work out: https://github.com/JamesHutchison/pytest-hot-reloading/

Side Quest Conclusion

Our 13-second import time is still there, but now we can focus on improving it properly, with pytest daemon being the stop-gap solution we needed.

One of our Company Principles is Progress Over Perfection:

All big things start small. Think long-term and break ideas down so you can start delivering value and learning right away. Strive for an 80/20 approach and compound from there. This is the essence of moving fast with both excellence and surprise & delight.

Building the first iteration of pytest daemon took about a week of work, and it was a fun challenge that supported our goal of helping our engineers move faster! My initial version was published as an opt-in tool our engineers could use, and after incorporating their feedback, we started migrating most of our local dev flows to use it.

I keep iterating on it every few weeks to see what small improvements would help our engineers move faster or resolve any issues preventing certain teammates from using it.

Speaking of moving faster, I used ChatGPT4 and GitHub Copilot to speed up my progress. Specifically:

Building my first VS Code extension was straightforward.
Asking ChatGPT to write a script to analyze all the imports in our code generated a relatively good starting point.

If improving developer environments excites you, check out our careers page. At the time of publishing, we’re actively hiring for our developer environments team!