Every Sufficiently Advanced Configuration Language is Wrong

Jun 20, 2022

Every sufficiently advanced configuration language is the wrong tool for the job.

For basic configuration, YAML or JSON is usually good enough. Both fall apart when you try to do more:

  • Template it with a templating engine
  • Use esoteric language features to reuse code (anchors and aliases)
  • Patch or modify it with something like JSONPatch
  • Type-check or schema validate

These are anti-patterns and often cause more issues than they solve. So instead, we develop more advanced configuration languages that aim to solve the problems we either duct-tape over in YAML or cannot express in YAML at all.

  • Eliminate duplication with object orientation (Jsonnet, GCL)
  • Schema definition and data validation (CUE)
  • Modules, packages, and inheritance (CUE)
  • Scripting and control flow (Dhall)

The logical extreme is becoming more evident: advanced configuration in general-purpose programming languages. You can see this in the emergence of TypeScript for Infrastructure-as-Code. For the most basic (and human-readable and human-writable) configuration needs, there will always be simple formats (YAML, JSON, INI, etc.).

For everything else, general-purpose languages will win out.

  • A variety of type systems (e.g., static vs. dynamic, nominal vs. structural) can fit every use case.
  • Inheritance and package management are already built-in. Schema definition and validation can live alongside schema usage.
  • Configuration can be unit or integration tested.
  • No toolchain sprawl – developers don't need to learn a new language or download new tools.
  • Configuration can reuse schema definitions from APIs. In a compiled configuration stack, you get built-in IntelliSense for schema usage and discovery.
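
To make a few of these points concrete, here is a minimal sketch of configuration written in TypeScript. Everything in it (the `ServiceConfig` shape, the `defineService` helper, the validation rules) is a hypothetical example, not an existing library. It shows shared defaults replacing anchors and aliases, a schema living next to its usage, and a validation rule that is just an ordinary function you can test.

```typescript
// Hypothetical schema for a service's configuration; all names are illustrative.
type Environment = "dev" | "staging" | "prod";

interface ServiceConfig {
  name: string;
  environment: Environment;
  replicas: number;
  port: number;
}

// Shared defaults take the place of YAML anchors and aliases.
const defaults = { replicas: 1, port: 8080 };

// Runtime validation covers constraints the static types cannot express.
function validate(config: ServiceConfig): string[] {
  const errors: string[] = [];
  if (Math.floor(config.replicas) !== config.replicas || config.replicas < 1) {
    errors.push("replicas must be a positive integer");
  }
  if (Math.floor(config.port) !== config.port || config.port < 1 || config.port > 65535) {
    errors.push("port must be between 1 and 65535");
  }
  if (config.environment === "prod" && config.replicas < 2) {
    errors.push("prod requires at least 2 replicas");
  }
  return errors;
}

// Merges overrides onto the defaults and rejects invalid results.
function defineService(
  overrides: Pick<ServiceConfig, "name" | "environment"> & Partial<ServiceConfig>
): ServiceConfig {
  const config: ServiceConfig = { ...defaults, ...overrides };
  const errors = validate(config);
  if (errors.length > 0) {
    throw new Error(`invalid config for ${config.name}: ${errors.join("; ")}`);
  }
  return config;
}

const api = defineService({ name: "api", environment: "prod", replicas: 3 });
```

A misspelled environment name fails at compile time, while a single-replica `prod` service fails when the configuration is evaluated. Both checks run before anything is deployed, and `validate` can be covered by the same unit tests as the rest of the codebase.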

But we're not quite there yet. TypeScript is one of the most promising languages for infrastructure (and, soon, language-agnostic project configuration). What else needs to happen before we see widespread configuration-as-code?

  • Configuration in code needs to be easily embeddable into existing projects without boilerplate. YAML requires a few lines of parsing code. TypeScript requires a package.json, a tsconfig, etc. WebAssembly offers an exciting path to embed interpreters in other host languages (like Go, Rust, Python, etc.).
  • Languages need general-purpose declarative constructs built-in to support constrained configuration sets. For example, AWS has aws/constructs, which serves as the basis for AWS CDK. We'll need more like this (and maybe even implement other APIs in it, see this RFP).
  • Advanced configuration tooling needs to be built for developers, not operators. The era of Chef, Puppet and other DevOps tooling that leaned heavily towards Ops is ending. Developers can write their code, write their own configuration, and deploy their own apps.
  • Not all general-purpose languages are good fits for configuration. For example, Go can be excessively verbose for configuration. Some language type systems will be a better fit than others for specific projects.
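
As a rough illustration of what "declarative constructs" could look like, here is a toy construct tree. It is a hypothetical sketch and deliberately not the aws/constructs API: the `Construct` and `Bucket` classes, the `render` hook, and the path-keyed output are all assumptions made for this example. The idea is that ordinary code declares a tree of resources, and a single `synth` step turns that tree into static configuration.

```typescript
// A hypothetical, minimal construct tree. This is NOT the aws/constructs API;
// it only illustrates declaring a constrained set of configuration in code.
class Construct {
  readonly children: Construct[] = [];

  constructor(readonly scope: Construct | undefined, readonly id: string) {
    if (scope) {
      // Ids must be unique among siblings so every node has a stable path.
      if (scope.children.some((child) => child.id === id)) {
        throw new Error(`duplicate id "${id}" under "${scope.path()}"`);
      }
      scope.children.push(this);
    }
  }

  path(): string {
    return this.scope ? `${this.scope.path()}/${this.id}` : this.id;
  }

  // Leaf constructs override this to contribute their declared settings.
  protected render(): Record<string, unknown> | undefined {
    return undefined;
  }

  // Walks the tree and collects every rendered node, keyed by its path.
  synth(): Record<string, unknown> {
    const output: Record<string, unknown> = {};
    const own = this.render();
    if (own) output[this.path()] = own;
    for (const child of this.children) {
      const rendered = child.synth();
      for (const key of Object.keys(rendered)) output[key] = rendered[key];
    }
    return output;
  }
}

// An illustrative leaf resource with a small, constrained set of options.
class Bucket extends Construct {
  constructor(scope: Construct, id: string, private readonly props: { versioned: boolean }) {
    super(scope, id);
  }

  protected render(): Record<string, unknown> {
    return { type: "bucket", versioned: this.props.versioned };
  }
}

const app = new Construct(undefined, "app");
const storage = new Construct(app, "storage");
new Bucket(storage, "logs", { versioned: true });
new Bucket(storage, "assets", { versioned: false });

const manifest = app.synth();
```

Here `manifest` is a plain object with one entry per bucket, keyed by paths such as `app/storage/logs`. The code stays imperative, but what it produces is a declarative description that other tools can consume.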

Some common objections:

  • But $language is not declarative or reproducible. Every project lives on the Spectrum of Reproducibility, and you can develop frameworks that satisfy your team's specific requirements (e.g., no external dependencies, byte-for-byte reproducibility, or whatever else).
  • We want configuration to be editable without writing code. Writing lines of Jinja templates is often more complex and less maintainable than the equivalent in code. The further the resulting configuration drifts from its static representation, the more useful a general-purpose language becomes.
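
A small sketch of that last point, with hypothetical names throughout (the `Deployment` shape, the region list, and the `registry.example.com` image path are made up for illustration). The loop and the conditional would be template directives in Jinja. Here they are a `map` and an `if`, and the output is still static JSON.

```typescript
// Hypothetical deployment settings; field names are illustrative only.
interface Deployment {
  name: string;
  image: string;
  replicas: number;
  env: Record<string, string>;
}

const regions = ["us-east-1", "eu-west-1"];

// One deployment per region; non-production builds get debug logging.
function deployments(version: string, production: boolean): Deployment[] {
  return regions.map((region) => {
    const env: Record<string, string> = { REGION: region };
    if (!production) {
      env["LOG_LEVEL"] = "debug";
    }
    return {
      name: `api-${region}`,
      image: `registry.example.com/api:${version}`,
      replicas: production ? 3 : 1,
      env,
    };
  });
}

// The rendered artifact is plain JSON that can be diffed and reviewed.
const rendered = JSON.stringify(deployments("1.4.2", false), null, 2);
```

Anyone who only wants to read or review the result can work with `rendered`. The code is needed only by whoever changes how the configuration is generated.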