Shift Left: Observability at the Edge

Logs are expensive in the cloud era. Collecting the telemetry is fairly cheap, but streaming terabytes across the network, storing them, and running continuous analyses on them get expensive quickly. Too many enterprises have been shocked by their monthly bill from Splunk or Datadog.

Legacy architectures like Splunk's are already prohibitively expensive for enterprises (you can see this reflected in the company's performance: the stock is trading at pre-pandemic levels). Even as the company shifts to a more cloud-native model, it continues to struggle. Other observability companies like Datadog face similar complaints: the product is just too expensive (and maybe the wrong model). Yet observability is more important than ever. What's the path forward?

I believe we'll start to see a shift left to observability at the edge. Not all metrics will be continuously collected. Not all raw logs will find their way to the data warehouse. Instead of a continuous stream of everything to Datadog or Splunk, we'll see smarter agents (or agentless collectors) do observability at the edge. These agents will (1) determine which metrics should be collected, and when, and (2) do basic analysis and report only higher-level metrics.
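To make the idea concrete, here is a minimal sketch of what such an agent might do in each reporting window. Everything in it is hypothetical: the `EdgeAggregator` class, the "LEVEL message" line format, the 5% error-rate threshold, and the cap on raw lines are all assumptions for illustration, not any vendor's API. The agent counts log lines by level locally and reports only the summary, attaching raw error lines when the error rate looks unusual.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EdgeAggregator:
    """Hypothetical edge agent that summarizes one window of logs locally."""

    error_rate_threshold: float = 0.05  # assumed cutoff for shipping raw lines
    max_raw_lines: int = 100            # assumed cap on raw lines kept per window
    _levels: Counter = field(default_factory=Counter, init=False)
    _raw_errors: list = field(default_factory=list, init=False)

    def observe(self, line: str) -> None:
        # Assumes lines look like "LEVEL message"; real log formats will differ.
        level = line.split(" ", 1)[0].upper()
        self._levels[level] += 1
        if level == "ERROR" and len(self._raw_errors) < self.max_raw_lines:
            self._raw_errors.append(line)

    def flush(self) -> dict:
        """Return the summary for this window and reset local state."""
        total = sum(self._levels.values())
        errors = self._levels["ERROR"]
        error_rate = errors / total if total else 0.0
        report = {
            "total": total,
            "errors": errors,
            "error_rate": error_rate,
            "counts_by_level": dict(self._levels),
        }
        # Raw lines leave the edge only when the window looks unhealthy.
        if error_rate > self.error_rate_threshold:
            report["raw_errors"] = list(self._raw_errors)
        self._levels.clear()
        self._raw_errors.clear()
        return report


agent = EdgeAggregator()
for line in ["INFO started", "INFO request ok", "ERROR db timeout"]:
    agent.observe(line)
print(agent.flush())
```

In a healthy window the only thing sent over the network is a handful of counters, which is where the savings would come from. A real agent would also need to handle time-based windows, structured logs, and backpressure, all of which this sketch leaves out.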

The result is the same level of observability, with less noise, at 10x lower cost. Less data over the network, less log spam in the database, and a better signal-to-noise ratio. Maybe we'll see it built into an eBPF-based agent, or maybe it will be an agentless in-cluster collector and analyzer.