I’m looking into profile-guided optimisation (PGO) in GCC as a future topic for the Linaro Toolchain team. PGO works by having you build your program twice: once with instrumentation, so that a run records what the program actually does, and then again using that profile to optimise better.
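As a rough sketch of that two-step build: -fprofile-generate and -fprofile-use are GCC's options for the instrumented and the profile-driven builds, while the file name app.c and the function sum_quotients() are made up for illustration. The function has a divide whose divisor is only known at run time, which is the kind of code a value profile can help with.

```c
/*
 * Hypothetical file app.c. A typical PGO cycle with GCC looks like:
 *
 *   gcc -O2 -fprofile-generate app.c -o app   # 1. instrumented build
 *   ./app                                     # 2. run on typical input;
 *                                             #    writes .gcda profile data
 *   gcc -O2 -fprofile-use app.c -o app        # 3. rebuild using the profile
 *
 * The function below would be called from the program's main().
 */
#include <stddef.h>

/* Sum values[i] / divisor. The divisor is a run-time value, so without a
 * profile the compiler has to emit a general divide. */
unsigned long sum_quotients(const unsigned *values, size_t count,
                            unsigned divisor)
{
    unsigned long total = 0;

    if (divisor == 0)
        return 0;               /* avoid dividing by zero */

    for (size_t i = 0; i < count; i++)
        total += values[i] / divisor;

    return total;
}
```

The profile is only as good as the run in step 2, so the input used there should resemble real usage.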
One optimisation is to track the values used in a function and special-case the most frequent one. I was quite impressed with what GCC currently does:
- Rewrite divides and modulos: change a = b / c to if c == N then a = b / N else a = b / c
- Rewrite modulo by a power of two: change a = b % c to if c is a power of 2 then a = b & (c - 1) else a = b % c
- Rewrite an indirect call to direct: change (*callback)() to if callback == N then N() else (*callback)()
- Rewrite string operations that usually see the same length: change memcpy(a, b, c) to if c == N then memcpy(a, b, N) else memcpy(a, b, c)
GCC's later optimisations can then improve the special cases even further, such as changing a divide by a power of two to a shift or inlining the memcpy() completely instead of doing a function call.
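To make the rewrites above concrete, here is a hand-written C sketch of what the specialised code amounts to. It is not GCC's actual output: the constants HOT_DIVISOR and HOT_LENGTH and the function hot_callback() are hypothetical stand-ins for the "most frequent value N" that a profile might record. The power-of-two case is written as a run-time test followed by a mask, which is my reading of the idea rather than a copy of what the compiler emits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical values standing in for "N", the most frequent value seen
 * in the profile. */
#define HOT_DIVISOR 8u
#define HOT_LENGTH  16u

typedef int (*callback_fn)(int);

/* Hypothetical callback that the profile says is the usual target. */
int hot_callback(int x) { return x + 1; }

/* Divide: the fast path has a constant divisor, so later passes can turn
 * it into a shift. */
unsigned div_specialised(unsigned b, unsigned c)
{
    if (c == HOT_DIVISOR)
        return b / HOT_DIVISOR;
    return b / c;
}

/* Modulo: if c is a power of two, a mask gives the same result.
 * As with a plain %, c must not be zero. */
unsigned mod_pow2_specialised(unsigned b, unsigned c)
{
    if (c != 0 && (c & (c - 1)) == 0)
        return b & (c - 1);
    return b % c;
}

/* Indirect call: the fast path is a direct call that can be inlined. */
int call_specialised(callback_fn callback, int arg)
{
    if (callback == hot_callback)
        return hot_callback(arg);
    return callback(arg);
}

/* memcpy: the fast path has a constant length, so the copy can be
 * expanded inline instead of calling the library function. */
void copy_specialised(void *dst, const void *src, size_t len)
{
    if (len == HOT_LENGTH)
        memcpy(dst, src, HOT_LENGTH);
    else
        memcpy(dst, src, len);
}
```

In every case the slow path is the original code, so the result is the same whichever branch is taken; the profile only decides which value is worth the extra compare.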