A New ML Stack

Jan 2, 2023

Parts of the machine learning stack of the 2010s are quickly drifting into irrelevance.

  • Model architecture changed. Purpose-built platforms and products for deep learning didn’t necessarily translate as architectures shifted. Early ML products built at Airbnb and Uber didn’t see widespread adoption among the next generation of growth startups.
  • Cloud infrastructure evolved. Open-source projects (e.g., Kubernetes) brought better distributed computing primitives, and separating compute from storage unlocked near-infinite scalability of both.
  • The “Modern Data Stack” saw much more activity than the MLOps space. Even the largest player in the ML platform market (Databricks) gravitated towards a platform built around a cloud data warehouse.
  • Supervised learning is still important, but GPT-class models are pretrained with self-supervised learning on unlabeled data (see the sketch just after this list). Labeling data will always be important, but it might not be the same barrier to entry as it was in the Scale AI era.
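
To make the self-supervised point concrete, here is a minimal sketch of the next-token-prediction objective behind GPT-style pretraining (the token ids are invented for illustration): the targets are derived from the raw text itself, so no human labeling is needed.

    import torch

    # A tokenized sentence (token ids invented for illustration).
    tokens = torch.tensor([12, 87, 5, 431, 9])

    # Next-token prediction: inputs and targets are both slices of the
    # same unlabeled text.
    inputs = tokens[:-1]   # the model sees     [12, 87, 5, 431]
    targets = tokens[1:]   # the model predicts [87, 5, 431, 9]

    # Training minimizes cross-entropy between the per-position
    # predictions and these shifted targets -- no human annotation
    # is required.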

What stayed the same? The core developer libraries: TensorFlow and PyTorch. Even these, though, have started to be consolidated into bigger building blocks (e.g., HuggingFace’s transformers library). These libraries have generally absorbed new developments fairly easily, and the primitives they provide (abstracting the underlying hardware, matrix multiplication, etc.) are generic enough to outlast any single architecture.
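
A minimal sketch of those two levels of abstraction, assuming PyTorch and the transformers pipeline API (which downloads a small library-default model on first use; no specific model is named in this post):

    import torch
    from transformers import pipeline

    # The generic primitive: a matrix multiply that runs unchanged on
    # CPU, GPU, or other accelerators.
    x = torch.randn(4, 8)
    w = torch.randn(8, 2)
    logits = x @ w  # torch.matmul under the hood

    # The consolidated building block: an entire pretrained model
    # behind a single call.
    classifier = pipeline("sentiment-analysis")
    print(classifier("The new ML stack is taking shape."))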

Notebooks are still the place for data exploration (and even exploratory training), though they are now much more likely to be remote (Google Colab or hosted on a cluster). Models, meanwhile, have continued to grow larger.

Of course, it’s not just the ML stack. Software half-life seems to be decreasing rapidly across the board — even for infrastructure.
