Infrastructure as SQL

Sep 17, 2021

If you can't beat 'em, JOIN 'em.

When infrastructure became truly planet-scale, we started caring more about infrastructure state than about commands.

The solution was declarative configuration – (barely) human-readable code that helped describe our desired state to complex systems. DevOps engineers might think of all of the miles of YAML they have to configure. As our systems continued to grow larger and more complex, state became more complex to describe. Engineers were copying miles of YAML or doing complicated templating that made output artifacts more and more opaque.

When things become more complex, so turns the Heptagon of Configuration. Moving beyond static configuration, we turned to DSLs and code. The infrastructure-as-code movement lets us configure the desired state of these systems with Turing-complete control flow like for loops and if statements. We can use higher-level programming language concepts like objects and inheritance.
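As a rough illustration of what loops and conditionals buy you, here is a minimal Python sketch that expands a compact description into a flat list of desired resources. The environment names, instance counts, and resource fields are all made up for the example; they don't correspond to any real provider's API.

```python
import json

# Hypothetical environments mapped to how many web instances each should have.
ENVIRONMENTS = {"dev": 1, "staging": 2, "prod": 4}


def desired_instances(environments):
    """Expand a compact description into a flat list of desired resources."""
    resources = []
    for env, count in environments.items():
        for i in range(count):
            resource = {"name": f"web-{env}-{i}", "env": env, "type": "small"}
            # Control flow that static YAML can't express directly:
            # only production gets the larger (hypothetical) instance type.
            if env == "prod":
                resource["type"] = "large"
            resources.append(resource)
    return resources


state = desired_instances(ENVIRONMENTS)
print(json.dumps(state, indent=2))
```

Three lines of input turn into seven resource definitions, which is exactly the kind of repetition that otherwise gets copied by hand.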

But infrastructure-as-code can be difficult to understand, and not everyone is a programmer. We've seen interesting trends from both sides of the software engineering stack: (Cloud) System Administrators may be experts in cloud configuration but not have deep software engineering skills. Data engineers might be SQL experts but not have strong algorithmic fundamentals.

So I saw a project that offered an interesting in-between: Cloud Infrastructure as SQL. The query language is actually not a bad fit for the job to be done: declaratively defining a set of desired states.
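To make the idea concrete, here is a small sketch using Python's built-in sqlite3 module. It is not the project's actual schema or interface; the `instance` table and its columns are assumptions invented for illustration. Desired state is just rows, and a change to that state is a single set-based statement.

```python
import sqlite3

# In-memory database standing in for a (hypothetical) desired-state store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance ("
    "name TEXT PRIMARY KEY, env TEXT NOT NULL, type TEXT NOT NULL)"
)

# Declare the desired state as rows.
conn.executemany(
    "INSERT INTO instance (name, env, type) VALUES (?, ?, ?)",
    [
        ("web-dev-0", "dev", "small"),
        ("web-prod-0", "prod", "small"),
        ("web-prod-1", "prod", "small"),
    ],
)

# One declarative statement replaces a loop plus an if:
# every production instance should be the larger type.
conn.execute("UPDATE instance SET type = 'large' WHERE env = 'prod'")

rows = conn.execute("SELECT name, type FROM instance ORDER BY name").fetchall()
for name, instance_type in rows:
    print(name, instance_type)
```

In a real system something would still have to reconcile these rows against the cloud provider; the sketch only covers the "describe the state" half.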

A different way to look at it is configuration-as-data. Whether or not it will catch on depends on a lot. Who is managing the infrastructure? Software engineers prefer the flexibility and control of code. Can the arbitrary transforms that we want to apply to sets of desired state be more easily described in SQL? Can we diff outputs and results more easily in SQL than in JSON?
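On the diffing question, one thing SQL does have going for it is set operations. The sketch below, again using Python's sqlite3 with made-up `desired` and `actual` tables, computes what would need to be created or changed and what would need to be removed. It is an illustration under those assumptions, not how any particular tool does it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE desired (name TEXT PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE actual  (name TEXT PRIMARY KEY, type TEXT NOT NULL);
    INSERT INTO desired VALUES
        ('web-1', 'large'), ('web-2', 'large'), ('web-3', 'small');
    INSERT INTO actual VALUES
        ('web-1', 'large'), ('web-2', 'small'), ('web-4', 'small');
    """
)

# Rows we want that don't exist as-is: things to create or update.
to_apply = conn.execute(
    "SELECT name, type FROM desired "
    "EXCEPT SELECT name, type FROM actual "
    "ORDER BY name"
).fetchall()

# Names that exist but are no longer wanted: things to delete.
to_delete = conn.execute(
    "SELECT name FROM actual "
    "WHERE name NOT IN (SELECT name FROM desired) "
    "ORDER BY name"
).fetchall()

print("apply:", to_apply)
print("delete:", to_delete)
```

The equivalent over two JSON documents usually means writing the comparison logic yourself, or leaning on a diff tool that knows nothing about which fields identify a resource.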