Benchmarking .NET code

TL;DR:

Benchmarking correctly is hard.
BenchmarkDotNet makes it much easier.

Predicting Performance From Code Is Hard

Telling faster code from slower code has become increasingly difficult with high-level languages and technological advances in hardware.
As C# developers in 2020, we write code that is statically compiled into IL, which is then dynamically compiled at runtime by a JIT compiler into machine code, possibly multiple times with varying degrees of optimization (Tiered Compilation).
This machine code then runs on a branch-predicting, register-renaming, out-of-order-executing, superscalar, hyperthreaded, clock-frequency-boosting, multi-core CPU with three or more layers of memory caches.

C# code is so far removed from what actually happens in silicon at runtime that reasoning your way through a performance analysis just by looking at the code can become a cumbersome, if not nigh-impossible, endeavor.

The solution is to do what any good scientist does to check their hypothesis: measure.
When results cannot be predicted from your actions with reasonable effort (in the Cynefin framework this would be on the left side, in the complex or chaotic domains), it is important to have a platform that lets you experiment frequently and quickly.
BenchmarkDotNet is such a platform for performance optimization.

Stopwatch Will Not Do

But why should we use a third-party tool for this?
After all, there is an input and an output, and some time passes while the code computes the latter from the former.
Simple, right?
Unfortunately, measuring .NET code performance accurately and reliably is non-trivial, as there are many pitfalls.
Some are fairly obvious (without a warm-up run, the initial JIT compilation will skew your measurements), while some are obscure (merely calling Stopwatch before your benchmarked code can greatly change its measured performance).
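
To make the naive approach concrete, here is a minimal sketch of a single Stopwatch-timed call; SumOfSquares is a made-up workload, not from any library.

```csharp
using System;
using System.Diagnostics;

static class NaiveBenchmark
{
    static long SumOfSquares(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += (long)i * i;
        return sum;
    }

    static void Main()
    {
        var sw = Stopwatch.StartNew();
        long result = SumOfSquares(1_000_000); // first call, so JIT compilation is timed too
        sw.Stop();
        Console.WriteLine($"{sw.Elapsed.TotalMilliseconds} ms (result: {result})");
    }
}
```

This single sample includes the JIT compilation of SumOfSquares, says nothing about variance, and may be dominated by whatever else the machine happened to be doing at that moment.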

BenchmarkDotNet Gets It Right

Enter BenchmarkDotNet, the current gold standard of .NET (micro)benchmarking, supported by the .NET Foundation.
It will not only save you from the aforementioned pitfalls, but also make benchmarking automated, easy, quick, and fun.
In addition, it makes performance analyses from different teams consistent by providing a common basis.
If you are doing performance optimization, it provides a quick, almost unit-test-like feedback cycle.

This enables you to make small code changes iteratively, measuring performance after each change to find out which changes increase speed and which don't, or even decrease it (prepare to be surprised).
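
Here is a minimal sketch of what that looks like; the [Benchmark] and [Params] attributes and BenchmarkRunner are the real BenchmarkDotNet API, while the SumOfSquaresBenchmarks class and its workload are made up for illustration.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class SumOfSquaresBenchmarks
{
    [Params(1_000, 1_000_000)] // each parameter value gets its own set of measurements
    public int N;

    [Benchmark]
    public long SumOfSquares()
    {
        long sum = 0;
        for (int i = 0; i < N; i++)
            sum += (long)i * i;
        return sum;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<SumOfSquaresBenchmarks>();
}
```

Run it with dotnet run -c Release, and BenchmarkDotNet takes care of warm-up, iteration counts, and statistics.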

BenchmarkDotNet will

  • verify that your benchmarked code was indeed compiled with optimizations turned on (a sanity check)
  • call your code enough times per measurement iteration to make time measurement accurate with respect to the available clock resolution
  • measure as many iterations as is necessary for statistically significant results
  • remove outliers (from that virus scan or that browser popup that happened to disturb your benchmark)
  • warn you about strange results (too many outliers, multimodal distribution etc.) that could hint at problems with your benchmarking method
  • output reports as HTML, Markdown, and CSV for consumption by the human eye, Excel, etc. (a sketch of exporter configuration follows below)
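
The report formats can be chosen per benchmark class with exporter attributes. The attributes in this sketch are real BenchmarkDotNet APIs; the ExportedBenchmarks class and its trivial workload are made up.

```csharp
using BenchmarkDotNet.Attributes;

// By default the reports are written to the BenchmarkDotNet.Artifacts/results folder.
[HtmlExporter]
[CsvExporter]
[MarkdownExporterAttribute.GitHub]
public class ExportedBenchmarks
{
    [Benchmark]
    public string Concat() => string.Concat("Benchmark", "DotNet"); // trivial placeholder workload
}
```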

But Wait, There’s More

In addition, there are so-called Diagnosers that can add to the report (see the sketch after this list)

  • the IL and ASM generated by the JIT
  • a trace file with profiling information for consumption with PerfView or similar software
  • GC stats
  • hardware counter info about branch prediction misses and cache misses
  • and much more
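
Diagnosers are again enabled via attributes. The attributes below are real BenchmarkDotNet APIs, while the DiagnosedBenchmarks class and its workload are made up; note that collecting hardware counters requires Windows and elevated privileges.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Diagnosers;

[MemoryDiagnoser]      // adds GC and allocation columns to the report
[DisassemblyDiagnoser] // dumps the assembly generated by the JIT
[HardwareCounters(HardwareCounter.BranchMispredictions, HardwareCounter.CacheMisses)]
public class DiagnosedBenchmarks
{
    [Benchmark]
    public int[] AllocateArray() => new int[1024]; // small allocating workload so the GC stats have something to show
}
```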

I highly recommend at least skimming https://benchmarkdotnet.org/ or their GitHub page to learn more.

Using BenchmarkDotNet for Performance Optimization

There will soon be another, shorter blog post about my experiences with CPU performance optimization guided by benchmarking with BenchmarkDotNet.