Should Your Developers Work in the Cloud?

The more your development environment looks like what you're ultimately deploying, the fewer headaches your team is going to have.

When using Kubernetes, you have a few different options for how your developers can work. I've built developer tools across the whole spectrum, and here are some benefits and drawbacks I've seen with each.

Build Local, Run Local [without Docker or Kubernetes]

Benefits

  • No upfront migration: Continue to develop your applications exactly how you did before
  • No learning curve: Developers don't have to learn new tooling
  • Usually quick: Once the development environment is set up and dependencies are downloaded, builds are usually quick. Developers can leverage native tools: compiler-level caching, running on local ports, and instant changes (no networked filesystem, SSH tunnel, etc.)

Drawbacks

  • Parity between environments: A significant departure from how services are actually run. There are many places where things can go wrong.
  • "Works on my machine": Setting up a developer environment needs to be done on a per-user basis.
  • Single platform development: The development OS cannot be different from the runtime environment (e.g. you can't develop on a Pixelbook or MacBook and deploy to a Linux environment.)

Build Local, Run Local [with Docker and Kubernetes]

Benefits

  • Closer to Production: Fewer differences with higher environments. Developers can catch issues in development rather than waiting for CI or QA to catch them.
  • Portable: You can run Docker and Kubernetes on every major OS.
  • Declarative environment: Setup and teardown development environments easily. No need for long developer environment setup documents. Applying the configuration for a cluster can be as easy as kubectl apply -f folder/.
  • Reproducible: Alongside declarative environments, bugs and other issues are easier to reproduce because Docker and Kubernetes manage the immediate dependencies for an application.
  • Full Control: Developers manage the entire stack and therefore have few limitations when developing.

Drawbacks

  • Limited: The environment may be too large to run on your workstation. Istio suggests 8GB of memory and 4 vCPUs on minikube. This won't work for users with high data or compute requirements (e.g. ML workloads).
  • Ops work for the Developer: Developers have to manage a local cluster. Minikube and Docker for Desktop provide one-click cluster setup, but what happens when your cluster goes down? Networking issues, OOM errors, and more can require developer intervention.

Build Local, Deploy Remote [with Kubernetes]

Benefits

  • Closest to Production: While it doesn't really matter which guest OS Docker uses, Kubernetes still has many host dependencies through the kubelet, which doesn't run containerized. A Kubernetes feature might work on Docker for Desktop or minikube's custom VM image but not on the one your production cluster uses.
  • More Portable: You can run Docker on every major OS.
  • Managed Declarative environment: Have your ops team manage the cluster, instead of the developers. Manage O(orgs) clusters, not O(developers).
  • Can support arbitrarily large environments
  • Can be shared by multiple users
  • Can utilize ops-managed resources (dashboard, logging, monitoring, specialized hardware like TPUs)

Drawbacks

  • Cost: You still have to buy hardware for your developers anyway, on top of paying for the remote cluster.
  • Speed: Build artifacts can be large, and it takes time to move large objects across a network.
  • New Development Tools: Apps aren't deployed to localhost by default like they might be locally.

Fast Kubernetes Development with File Sync and Smart Rebuilds

What if I told you that you didn't have to rebuild your Docker images every time you made a change?

I'm happy to share with you a feature I added in the last release of skaffold that instantly syncs files to your running containers without any changes to your deployments or extra dependencies.

You can get it today with skaffold 0.16.0.


All you need for this tutorial is a running Kubernetes cluster and kubectl.

I'm going to be:

  1. Creating a Flask python app
  2. Dockerizing it
  3. Kuberneterizing it
  4. Watching my changes instantly get reflected in the cluster

If you'd prefer to just clone the repository, you can get these 4 files at https://github.com/r2d4/skaffold-sync-example

Creating the Flask App

app.py

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World from Flask!'

Dockerizing it

Dockerfile

FROM python:3.7-slim

RUN pip install Flask==1.0
COPY *.py .

ENV FLASK_DEBUG=1
ENV FLASK_APP=app.py
CMD ["python", "-m", "flask", "run"]

The two environment variables tell Flask to print stack traces and reload on file changes.

  • I used python:3.7-slim to get a smaller image
  • With more than one requirement, you'll want to create a separate requirements.txt file and COPY that in. We're only using Flask, so I kept it simple here.
  • Did you know? Before Docker 1.10, ENV and other commands used to create layers. Now only RUN, COPY, and ADD do, so go ahead and add those cheap commands to the end of your Dockerfile.
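
The requirements.txt variant mentioned above might look like this (a sketch; the pinned version is illustrative):

```dockerfile
FROM python:3.7-slim

# Copy and install dependencies first, so this layer stays cached
# until requirements.txt actually changes.
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY *.py .

ENV FLASK_DEBUG=1
ENV FLASK_APP=app.py
CMD ["python", "-m", "flask", "run"]
```

Because the dependency install comes before `COPY *.py .`, editing your application code no longer invalidates the pip layer.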

Kuberneterizing it

k8s-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: python
spec:
  containers:
  - name: python
    image: gcr.io/k8s-skaffold/python-reload
    ports:
    - containerPort: 5000

If Kubernetes were carbon-based life, the Pod would be the atom. If you're not using minikube or Docker for Desktop, you'll need to change that image name to something you can push to.

The Magic

skaffold.yaml

apiVersion: skaffold/v1alpha4
kind: Config
build:
  artifacts:
  - image: gcr.io/k8s-skaffold/python-reload
    sync:
      '*.py': .
deploy:
  kubectl:
    manifests:
    - k8s/**

This is the last YAML file I'm going to make you copy, I swear.

The magic here is the "sync" field, which tells skaffold to sync any Python file to the container when it changes.

Make sure the image name matches the image name you used above if you changed it.

  • Did you know? Skaffold supports building a few types of "artifacts" other than Dockerfiles: anything that produces a Docker image.
  • I used a glob pattern in the deploy part of the config, so when new Kubernetes manifests are added, skaffold will be smart enough to redeploy them.
  • Skaffold can also detect changes in skaffold.yaml itself and reload.
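
The sync map pairs a glob pattern with a destination directory in the container. As a rough illustration of how a pattern like '*.py' selects changed files (this uses Python's fnmatch as an analogy, not skaffold's actual matching code):

```python
from fnmatch import fnmatch

# A hypothetical set of files that changed on the developer's machine
changed = ["app.py", "util.py", "README.md", "k8s-pod.yaml"]

# Only files matching the '*.py' pattern get synced straight into the
# running container; other changes fall back to a rebuild.
synced = [f for f in changed if fnmatch(f, "*.py")]
print(synced)  # ['app.py', 'util.py']
```

Only the matched Python files would be copied; the YAML and Markdown changes would not.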

Development

Run

skaffold dev

You should see some output ending with

$ skaffold dev
...
Port Forwarding python 5000 -> 5000
[python]  * Serving Flask app "app.py" (lazy loading)
[python]  * Environment: production
[python]    WARNING: Do not use the development server in a production environment.
[python]    Use a production WSGI server instead.
[python]  * Debug mode: on
[python]  * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
[python]  * Restarting with stat
[python]  * Debugger is active!
[python]  * Debugger PIN: 289-130-309

Follow the link to http://127.0.0.1:5000/. But, isn't my application running on Kubernetes, in Docker, possibly in a VM on my computer or in the cloud?

Yep. Skaffold is smart enough to port-forward any ports in your deployments to your laptop. You don't have to worry about exposing your development environments to the internet just to ping a URL. The connection is secure between you and your cluster.

Go ahead and make some changes to your Flask app. Whatever you want: change the message, add more routes, add more Python files, delete some files.
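
For example, you might add an extra route to app.py (the /goodbye route here is made up for illustration):

```python
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World from Flask!'

# A new route added during development; saving the file syncs it into
# the container, and Flask's debug reloader restarts the app.
@app.route('/goodbye')
def goodbye():
    return 'Goodbye from Flask!'
```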

Now check the output on your skaffold dev terminal.

....
Synced files for gcr.io/k8s-skaffold/python-reload:dirty-2db9f3d...
Copied: map[app.py:app.py]
Deleted: map[]
Watching for changes...
[python]  * Detected change in '/app.py', reloading
[python]  * Restarting with stat
[python]  * Debugger is active!
[python]  * Debugger PIN: 289-130-309
...

If you visit http://127.0.0.1:5000/, you'll see the changes that you made to your image, nearly instantly.

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
python    1/1       Running   0          6m

If you want to see a Node.js example in action, I've added this example and another to the official skaffold repository.

Offline is a Feature, Not a Product

The argument that development tools must work offline misses the fundamental point of offline. Offline is a feature for development tools, added in pursuit of the real prize: faster development cycles. Everything else (caching, sync, and running the stack locally) is a means to that end.

And it turns out the best way to speed up developer cycles is to do the exact opposite.

We should be exploring what collaborative workflows we can enable in an online, cloud-native world.

Git was fundamentally transformed when developers could share and collaborate on repositories through GitHub. Linux containers were brought into the mainstream when they could be packaged as Docker images and shared through a Docker registry. Cloud has changed the way that organizations run their infrastructure, and the next frontier is changing how programmers develop.

IDEs used to be heavyweight, platform-specific tools that coupled the development runtime to the execution runtime. Now they are cross-platform and lightweight, and increasingly more operations are being pushed to the cloud: builds, code search, language servers and autocomplete, CI, and static analysis.

Even deploying your code used to require developers to run an entire stack locally. What if all the developers on a team shared a Kubernetes cluster? Docker abstracts the platform, and Kubernetes abstracts the environment. Teams go faster when each developer doesn't need to administer a cluster or install core dependencies like Istio and Knative themselves.

And most airplanes offer WiFi now anyway.