Stochastic/Deterministic

Apr 18, 2023

For the last few decades of computing, the focus has been on deterministic behavior. As systems grew larger, we needed more ways to eliminate classes of non-deterministic bugs. For instance, you might perform a software build and receive different checksums given the same inputs. System time, non-deterministic file-ordering, or random number generation could lead to differences. Most of the time, these things don’t matter, but when they do, these bugs are incredibly costly to find and fix.
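
As a rough illustration of how file ordering and timestamps leak into a checksum, here is a minimal Python sketch. The `archive_digest` helper and the file names are made up for this example and don't come from any real build tool. It packs the same inputs into an in-memory tar archive and hashes the result.

```python
import hashlib
import io
import tarfile


def archive_digest(files, mtime=0, sort_entries=True):
    """Pack files (name -> bytes) into an in-memory tar and return its SHA-256.

    Hypothetical helper for illustration only.
    """
    names = sorted(files) if sort_entries else list(files)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime  # pinned timestamp instead of the system clock
            tar.addfile(info, io.BytesIO(files[name]))
    return hashlib.sha256(buf.getvalue()).hexdigest()


# Same inputs, listed in a different order (as a directory walk might return them).
a = {"main.c": b"int main(){}", "util.c": b"void f(){}"}
b = {"util.c": b"void f(){}", "main.c": b"int main(){}"}

# Sorted entries and a fixed timestamp: the checksums match.
reproducible = archive_digest(a) == archive_digest(b)

# Entry order leaks into the archive: the checksums differ.
order_leak = archive_digest(a, sort_entries=False) != archive_digest(b, sort_entries=False)

# The clock leaks into the archive: the checksums differ.
time_leak = archive_digest(a, mtime=1) != archive_digest(a, mtime=2)

print(reproducible, order_leak, time_leak)
```

Sorting the entries and pinning the timestamp is the whole fix here. Real build systems have to close many more leaks of the same kind.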

Generative AI introduces much more stochasticity into programming. For example, a model might generate code that is then executed, or decide at runtime how to stitch different services together. It’s a significant paradigm shift and opens up an entirely new class of software to be built.

Ironically, this makes the deterministic parts that much more important. Bit-reproducible builds (using tools such as Bazel, Pants, or Buck) were only important to a small subset of companies that ran software at an immense scale. Most organizations didn’t have enough randomness in their systems to justify the extra work that reproducibility requires. But now, anything that touches generative AI interfaces needs as much determinism as possible.

Practically, that means:

  • Using code over natural language in as many places as possible. Code can only be interpreted in a single way (given the right deterministic toolchain).
  • Reproducibility in every part of the toolchain: reproducible environments, builds, tools, and workflows.
  • Version and change control.
  • Declarative interfaces (versus imperative commands).
  • Hermeticity.
  • Functional programming.
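
To make a few of these concrete, here is one possible sketch in Python. Everything in it is hypothetical: `GenerationSpec`, `fingerprint`, `generate_once`, and `fake_model` are names invented for this example, and `fake_model` merely stands in for a stochastic model call. The idea is to describe a generation request declaratively, give it a stable fingerprint that can be checked into version control, and record the output the first time so later runs replay it instead of sampling again.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenerationSpec:
    """Declarative description of a request (hypothetical fields)."""
    model: str
    prompt: str
    temperature: float = 0.0


def fingerprint(spec):
    """Stable hash of the spec: canonical JSON with sorted keys, then SHA-256."""
    canonical = json.dumps(asdict(spec), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def generate_once(spec, generate, store):
    """Call the stochastic generator at most once per distinct spec."""
    key = fingerprint(spec)
    if key not in store:
        store[key] = generate(spec)
    return store[key]


def fake_model(spec):
    # Stand-in for a model call: different output on every invocation.
    return f"{spec.prompt} -> {random.random()}"


store = {}
spec = GenerationSpec(model="example-model", prompt="write a parser")
first = generate_once(spec, fake_model, store)
second = generate_once(spec, fake_model, store)  # replayed, not re-sampled
```

The model call itself stays stochastic. The deterministic parts are everything around it: the spec, its fingerprint, and the recorded output.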