#Performance

2 articles

Profile-Guided Optimization Made Our Code Slower

That's the whole story. I took a virtual-dispatch interpreter loop — the textbook PGO target — instrumented it, trained it on a representative workload, and recompiled. Both GCC 15.2.1 and Clang 21.1.

Modern C++ // dev Apr 20, 2026 8 min read

Cache-Line Archaeology: Finding and Fixing False Sharing in Production

False sharing is a measurable, fixable performance bug that hides in struct layouts. Two atomic counters in the same cache line can cost you 6x throughput — and perf c2c finds it in seconds.

Modern C++ // dev Apr 19, 2026 9 min read
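
To make the false-sharing teaser above concrete, here is a minimal sketch of the two layouts it describes: two atomic counters packed into one cache line, and the same counters padded onto separate lines. It is not the article's code. The names `SharedLine`, `PaddedLines`, `hammer`, and `kCacheLine` are hypothetical, and the 64-byte line size is an assumption that holds on typical x86-64 parts but not everywhere.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// Assumed cache-line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;

// Both counters sit next to each other, so they will usually share a line.
struct SharedLine {
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// Each counter is aligned to the start of its own (assumed) cache line.
struct PaddedLines {
    alignas(kCacheLine) std::atomic<std::uint64_t> a{0};
    alignas(kCacheLine) std::atomic<std::uint64_t> b{0};
};

static_assert(sizeof(PaddedLines) >= 2 * kCacheLine,
              "padded counters must not share a cache line");

// Runs two threads, each incrementing only its own counter, and returns
// the elapsed wall-clock time in seconds.
template <typename Counters>
double hammer(Counters& c, std::uint64_t iters) {
    auto bump = [iters](std::atomic<std::uint64_t>& x) {
        for (std::uint64_t i = 0; i < iters; ++i)
            x.fetch_add(1, std::memory_order_relaxed);
    };
    auto start = std::chrono::steady_clock::now();
    std::thread t1(bump, std::ref(c.a));
    std::thread t2(bump, std::ref(c.b));
    t1.join();
    t2.join();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    // Sanity check: no increments were lost on a freshly zeroed struct.
    assert(c.a.load() == iters && c.b.load() == iters);
    return elapsed.count();
}
```

Calling `hammer` once on a fresh `SharedLine` and once on a fresh `PaddedLines` with the same iteration count gives two timings to compare. The ratio depends heavily on the hardware, core placement, and compiler flags, so this sketch makes no claim about reproducing the 6x figure. On Linux, `perf c2c record` followed by `perf c2c report` is the usual way to confirm which cache lines are actually contended.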