But much better than that was switching to `gold` instead of the regular `ld` (mentioned in the "remarks" section near the bottom of the article). That brought my time down substantially further, from 5.8 seconds to 1.6 seconds. Even without `-gsplit-dwarf`, it brought the recompile+relink time down from the original 7.2 seconds to 2.6 seconds.
I guess my codebase might be hitting a particularly bad case for `ld`. I'm kind of startled to see such a large difference for what seemed like a throwaway extra suggestion at the end of the article!
Also note that until not so long ago, ccache didn't support split DWARF. If you're using ccache, check that your version supports it.
lld spits out a bunch of warnings that neither ld nor gold cared about, about finding local symbols in the global symbol table of some of the third-party .so libraries I link against. Not sure to what extent I should care about that; the final build product does run the same as always.
And just so nobody else needs to go searching: it looks like ccache added support for split DWARF debugging data in version 3.2.3, about three years ago (although I haven't actually tried it to confirm yet).
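For anyone who wants to script that version check, here is a rough Python sketch. It assumes `ccache --version` prints a line of the form `ccache version X.Y.Z`, and it takes the 3.2.3 threshold straight from the comment above, which has not been confirmed. The function names are made up for illustration.

```python
import re
import shutil
import subprocess

# Threshold taken from the comment above; treat it as an unverified assumption.
MIN_SPLIT_DWARF_VERSION = (3, 2, 3)


def parse_ccache_version(version_output):
    """Return the version as a tuple of ints, or None if it cannot be found.

    Assumes the output contains a line like "ccache version 3.7.7".
    """
    match = re.search(r"ccache version (\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_output, minimum=MIN_SPLIT_DWARF_VERSION):
    """True if the reported version is at least `minimum`."""
    version = parse_ccache_version(version_output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    if shutil.which("ccache") is None:
        print("ccache not found on PATH")
    else:
        result = subprocess.run(
            ["ccache", "--version"], capture_output=True, text=True
        )
        print("meets the assumed 3.2.3 minimum:", meets_minimum(result.stdout))
```

Comparing tuples of integers rather than strings avoids the usual trap where "3.10" sorts before "3.2".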
3:16 minutes -> 1:42 minutes: "We get a roughly 90% speedup in elapsed time."
What's the formula for the "speedup"? I think it would be clearer to express the comparison in the same terms as the measurements.
On this subject: https://randomascii.wordpress.com/2018/02/04/what-we-talk-ab...
1:42 -> 3:16: 90% slowdown?
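To make the ambiguity concrete, here is a small Python sketch that computes the two most common readings of "speedup" for the quoted times. Which one the article intended is not stated, so both definitions are assumptions, and the helper names are mine.

```python
def to_seconds(mmss):
    """Convert a "M:SS" string such as "3:16" to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def compare(before, after):
    """Return two different percentages for the same pair of timings."""
    old, new = to_seconds(before), to_seconds(after)
    return {
        # Reduction in elapsed time, relative to the old time.
        "time_saved_pct": (old - new) / old * 100,
        # Increase in rate (builds per unit time), relative to the old rate.
        "speed_gain_pct": (old / new - 1) * 100,
    }


if __name__ == "__main__":
    for label, value in compare("3:16", "1:42").items():
        print(f"{label}: {value:.1f}")
    # time_saved_pct: 48.0
    # speed_gain_pct: 92.2
```

So 196 s -> 102 s is about 48% less elapsed time, or about a 92% higher build rate. The article's "roughly 90%" lines up with the second reading. Going the other way, 1:42 -> 3:16 takes about 92% more time, which is why the two directions only look symmetric if you switch definitions in between.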