A New ML Stack

It's time for a new ML stack -- for software engineers.

Software engineers are one of the fastest-growing groups of users of LLMs and other foundation models.

It's never been easier to integrate models directly at the application layer (or at the edge). Developers don't need to craft n-dimensional arrays or complicated embeddings; the input is just some text or an image.

Other parts of the machine learning stack of the 2010s are quickly drifting into irrelevancy.

Model architecture changed. Purpose-built platforms and products for deep learning didn’t necessarily carry over to the new architectures. Early ML products built at Airbnb and Uber didn’t see widespread adoption among the next generation of growth startups. What's feature engineering?
Cloud infrastructure evolved. Open-source projects (e.g., Kubernetes) provide better distributed computing primitives, and compute and storage scale almost without limit if properly separated. Instead of HPC, OpenAI uses Kubernetes.
The “Modern Data Stack” saw much more activity than the MLOps space. Even the largest player (Databricks) gravitated towards a platform built around a cloud data warehouse. But will foundation models be part of the MDS? Or will engineering teams adopt them directly? The shortest distance between two points is a straight line.
Supervised learning is still important, but GPT-class models are pretrained without labeled data. Labeling will always matter, but it might not be the same barrier to entry as it was in the Scale AI era. What's step zero of the foundation model workflow?

What stayed the same? The developer tools and libraries: TensorFlow and PyTorch, although even these have started to be consolidated into bigger building blocks (e.g., Hugging Face’s transformers library). Generally, these libraries have absorbed new developments fairly easily, and the primitives that they provide (abstracting the underlying hardware, matrix multiplication, etc.) are generic enough.

Notebooks are still the place for data exploration (and even exploratory training). However, they are now much more likely to be remote (Google Colab or hosted on a cluster), since models have continued to grow larger. Experimentation for developers, though, might happen right in the IDE.

Of course, it’s not just the ML stack. Software half-life seems to be decreasing rapidly across the board — even for infrastructure.