Interpretability in Complex Systems
Why interpretability is structural, not cosmetic, and what that demands from the systems we build.
Interpretability is often treated as a finishing touch — something added once the model works. I want to argue the opposite. Interpretability is a property of the system's architecture, not its presentation layer.
If we cannot trace why a system produced a particular output, we have not built a tool. We have built an oracle, and oracles are difficult collaborators.
The systems I trust most are the ones whose internals I can question. Not necessarily simple — but legible, in the way a well-structured proof is legible.