Should Your Developers Work in the Cloud?

The more your development environment looks like what you're ultimately deploying, the fewer headaches your team is going to have.

When using Kubernetes, you have a few different options for how your developers can work. I've built developer tools across the whole spectrum, and here are some benefits and drawbacks I've seen with each.

Build Local, Run Local [without Docker or Kubernetes]

Benefits

  • No upfront migration: Continue to develop your applications exactly how you did before
  • No learning curve: Developers don't have to learn new tooling
  • Usually quick: Once the development environment is set up and dependencies are downloaded, builds are usually quick. Developers can leverage native tools: compiler-level caching, running on local ports, and instant changes (no networked filesystem, SSH tunnel, etc.)

Drawbacks

  • Parity between environments: A significant departure from how services are actually run. Many places where things can go wrong.
  • "Works on my machine": Setting up a developer environment needs to be done on a per-user basis.
  • Single platform development: The development OS cannot be different from the runtime environment (e.g. you can't develop on a Pixelbook or MacBook and deploy to a Linux environment.)

Build Local, Run Local [with Docker and Kubernetes]

Benefits

  • Closer to Production: Fewer differences with higher environments. Developers can catch issues in development rather than waiting for CI or QA to catch them.
  • Portable: You can run Docker and Kubernetes on every major OS.
  • Declarative environment: Set up and tear down development environments easily. No need for long developer environment setup documents. Applying the configuration for a cluster can be as easy as kubectl apply -f folder/ (see the sketch after this list).
  • Reproducible: Alongside declarative environments, bugs and other issues are easier to reproduce because Docker and Kubernetes manage the immediate dependencies of an application.
  • Full Control: Developers manage the entire stack and therefore have few limitations when developing.
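
As a minimal sketch of that declarative setup and teardown (the folder name here is just an example), a directory of Kubernetes manifests becomes the whole environment definition:

$ kubectl apply -f k8s/     # create or update everything defined in the folder
$ kubectl delete -f k8s/    # tear the same environment back down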

Drawbacks

  • Limited: The environment may be too large to run on your workstation. Istio, for example, suggests 8 GB of memory and 4 vCPUs on minikube (see the sizing example after this list). This won't work for users with high data or compute requirements (e.g. ML workloads)
  • Ops work for the Developer: Developers have to manage a local cluster. Minikube and Docker for Desktop provide one-click cluster setup, but what happens when your cluster goes down? Networking issues, OOM errors, and more can require developer intervention.
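
If you do run a local cluster, sizing it up front avoids some of those OOM surprises. A typical invocation, assuming minikube and the Istio numbers above (adjust to your own stack):

$ minikube start --cpus 4 --memory 8192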

Build Local, Deploy Remote [with Kubernetes]

Benefits

  • Closest to Production: While it doesn't really matter what guest OS Docker uses, Kubernetes still has many host dependencies through the kubelet, which doesn't run containerized. A Kubernetes feature might work on Docker for Desktop or minikube's custom VM image but not on the one your production cluster uses.
  • More Portable: Only Docker needs to run locally, and Docker runs on every major OS.
  • Managed Declarative environment: Have your ops team manage the cluster, instead of the developers. Manage O(orgs) clusters, not O(developers).
  • Can support arbitrarily large environments
  • Can be shared by multiple users
  • Can utilize ops-managed resources (dashboard, logging, monitoring, specialized hardware like TPUs)

Drawbacks

  • Cost: You're paying for remote clusters on top of the workstations you have to buy for your developers anyway.
  • Speed: Build artifacts can be large, and it takes time to move large objects across a network.
  • New Development Tools: Apps aren't deployed to localhost by default like they might be locally.

Fast Kubernetes Development with File Sync and Smart Rebuilds

What if I told you that you didn't have to rebuild your Docker images every time you made a change?

I'm happy to share with you a feature I added in the last release of skaffold that instantly syncs files to your running containers without any changes to your deployments or extra dependencies.

You can get it today with skaffold 0.16.0.


All you need for this tutorial is a running Kubernetes cluster and kubectl.
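
If you want to make sure kubectl is pointed at the cluster you expect, a quick sanity check:

$ kubectl get nodes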

I'm going to be:

  1. Creating a Flask python app
  2. Dockerizing it
  3. Kuberneterizing it
  4. Watching my changes instantly get reflected in the cluster

If you'd prefer to just clone the repository, you can get these 4 files at https://github.com/r2d4/skaffold-sync-example

Creating the Flask App

app.py

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World from Flask!'

Dockerizing it

Dockerfile

FROM python:3.7-slim

RUN pip install Flask==1.0
COPY *.py .

ENV FLASK_DEBUG=1
ENV FLASK_APP=app.py
CMD ["python", "-m", "flask", "run"]

The two environment variables tell Flask to print stack traces and reload on file changes.

  • I used the python:3.7-slim base image to keep the image small
  • With more than one dependency, you'll want to create a separate requirements.txt file and COPY that in (see the sketch after this list). We're only using Flask, so I kept it simple here.
  • Did you know? Before Docker 1.10, ENV and other commands used to create layers. Now only RUN, COPY, and ADD do. So go ahead and add those cheap commands to the end of your Dockerfile.
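
If you do outgrow a single dependency, the requirements.txt variant of the Dockerfile might look like this (requirements.txt here is a hypothetical file listing your pinned packages):

FROM python:3.7-slim

COPY requirements.txt .
RUN pip install -r requirements.txt
COPY *.py .

ENV FLASK_DEBUG=1
ENV FLASK_APP=app.py
CMD ["python", "-m", "flask", "run"]

Copying requirements.txt before the source means the dependency install stays cached when only your code changes.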

Kuberneterizing it

k8s-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: python
spec:
  containers:
  - name: python
    image: gcr.io/k8s-skaffold/python-reload
    ports:
    - containerPort: 5000

If Kubernetes were carbon-based life, the Pod would be the atom. If you're not using minikube or Docker for Desktop, you're going to need to change that image name to something you can push to.

The Magic

skaffold.yaml

apiVersion: skaffold/v1alpha4
kind: Config
build:
  artifacts:
  - image: gcr.io/k8s-skaffold/python-reload
    sync:
      '*.py': .
deploy:
  kubectl:
    manifests:
    - k8s/**

This is the last YAML file I'm going to make you copy, I swear.

The magic here is the "sync" field, which tells skaffold to sync any Python file to the container when it changes.

Make sure the image name matches the image name you used above if you changed it.

  • Did you know? Skaffold supports building a few types of "artifacts" other than Dockerfiles. Anything that produces a Docker image.
  • I used a glob pattern (k8s/**) in the deploy part of the config, so keep the pod manifest in a k8s/ directory; when new Kubernetes manifests are added there, skaffold will be smart enough to redeploy.
  • Skaffold can also detect changes in the skaffold.yaml itself and reload.

Development

Run

skaffold dev

You should see some output ending with:

$ skaffold dev
...
Port Forwarding python 5000 -> 5000
[python]  * Serving Flask app "app.py" (lazy loading)
[python]  * Environment: production
[python]    WARNING: Do not use the development server in a production environment.
[python]    Use a production WSGI server instead.
[python]  * Debug mode: on
[python]  * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
[python]  * Restarting with stat
[python]  * Debugger is active!
[python]  * Debugger PIN: 289-130-309

Follow the link to http://127.0.0.1:5000/. But isn't my application running on Kubernetes, in Docker, possibly in a VM on my computer or in the cloud?

Yep. Skaffold is smart enough to port-forward any ports in your deployments to your laptop. You don't have to worry about exposing your development environments to the internet just to ping a URL. The connection is secure between you and your cluster.
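
Under the hood this is the same kind of tunnel you would get from kubectl port-forward. If you ever want to set one up yourself without skaffold, the plain-kubectl equivalent for this pod would be:

$ kubectl port-forward pod/python 5000:5000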

Go ahead and make some changes to your Flask app. Whatever you want: change the message, add more routes, add more Python files, delete some files.

Now check the output on your skaffold dev terminal.

....
Synced files for gcr.io/k8s-skaffold/python-reload:dirty-2db9f3d...
Copied: map[app.py:app.py]
Deleted: map[]
Watching for changes...
[python]  * Detected change in '/app.py', reloading
[python]  * Restarting with stat
[python]  * Debugger is active!
[python]  * Debugger PIN: 289-130-309
...

If you visit http://127.0.0.1:5000/, you'll see the changes you made, nearly instantly.

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
python    1/1       Running   0          6m

If you want to see a Node.js example in action, I've added this example and another in the official skaffold repository.

Unit Testing with the Kubernetes Client Library

How do you unit test code that makes Kubernetes API calls?

The fake clientset in the Kubernetes client library lets you mock out a cluster to test your code against.

As one of the first consumers of the kubernetes/client-go library when building kubernetes/minikube, I built elaborate mocks for services, pods, and deployments to unit test my code against. Now, there's a much simpler way to do the same thing with significantly fewer lines of code.

I'm going to show how to test a simple function that lists all the container images running in a cluster. The unit tests themselves run against a fake client, so they don't need a cluster; if you also want to run the function for real, I suggest GKE or Docker for Desktop.

Setup

Clone the example repository https://github.com/r2d4/k8s-unit-test-example if you want to run the commands and follow along interactively.

main.go

package main

import (
	"github.com/pkg/errors"
	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes/typed/core/v1"
)

// ListImages returns a list of container images running in the provided namespace
func ListImages(client v1.CoreV1Interface, namespace string) ([]string, error) {
	pl, err := client.Pods(namespace).List(meta_v1.ListOptions{})
	if err != nil {
		return nil, errors.Wrap(err, "getting pods")
	}

	var images []string
	for _, p := range pl.Items {
		for _, c := range p.Spec.Containers {
			images = append(images, c.Image)
		}
	}

	return images, nil
}
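
Because ListImages only asks for the CoreV1Interface, the same function works against a real cluster, not just the fake clientset used in the tests below. A rough sketch of wiring it up, assuming a standard kubeconfig (this main is hypothetical and not part of the example repo):

package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a rest.Config from the local kubeconfig (placeholder path).
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	// clientset.CoreV1() satisfies the same interface the fake clientset does.
	images, err := ListImages(clientset.CoreV1(), "default")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(images)
}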

Writing the Tests

Let's start with a definition of our test cases, and some skeleton code for running the tests.

func TestListImages(t *testing.T) {
	var tests = []struct {
		description string
		namespace   string
		expected    []string
		objs        []runtime.Object
	}{
		{"no pods", "", nil, nil},
	}

	// Actual testing code goes here...
}

What's Happening

This style of writing tests is called "table-driven tests", and in Go it is the preferred style. The actual test code iterates over the table entries and performs the necessary tests. Test code is written once and used for each case. Some interesting things to note:

  • An anonymous struct holds the test case definition, which lets us define test cases concisely.
  • The runtime.Object slice objs holds all the objects we want our mock API server to return. We'll be populating it with some pods, but you can use any Kubernetes object here.
  • The trivial test case: no pods on the server should return no images.

Test Loop

Let's fill out the actual test code that will run for every test case.

	for _, test := range tests {
		t.Run(test.description, func(t *testing.T) {
			client := fake.NewSimpleClientset(test.objs...)
			actual, err := ListImages(client.CoreV1(), test.namespace)
			if err != nil {
				t.Errorf("Unexpected error: %s", err)
				return
			}
			if diff := cmp.Diff(actual, test.expected); diff != "" {
				t.Errorf("%T differ (-got, +want): %s", test.expected, diff)
				return
			}
		})
	}

Some interesting things to note:

  • t.Run executes a subtest. Why use subtests?
    • You can run specific test cases using the -run flag to go test (see the example after this list)
    • You can do setup and tear-down
    • And subtests are the entrypoint to running test cases in parallel (not done here)
  • Actual and expected results are diffed with cmp.Diff. Diff returns a human-readable report of the differences between two values. It returns an empty string if and only if Equal returns true for the same input values and options.
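
For example, to run just the trivial case defined above (Go replaces the spaces in subtest names with underscores):

$ go test -run 'TestListImages/no_pods'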

fake.NewSimpleClientset returns a clientset that will respond with the provided objects. It's backed by a very simple object tracker that processes creates, updates and deletions as-is, without applying any validations and/or defaults.

Test Cases

Let's create a pod helper function that provides pods for us to test against. Since we care about the namespace and image, the helper creates a new pod from those two parameters.

func pod(namespace, image string) *v1.Pod {
	return &v1.Pod{
		ObjectMeta: meta_v1.ObjectMeta{Namespace: namespace},
		Spec:       v1.PodSpec{Containers: []v1.Container{{Image: image}}},
	}
}

Let's write three unit tests. The first will just make sure that we grab all images if we use the special namespace value "" to list pods in all namespaces.

{"all namespaces", "", []string{"a", "b"}, []runtime.Object{pod("correct-namespace", "a"), pod("wrong-namespace", "b")}}

The second case will make sure that we filter correctly by namespace, ignoring the pod in wrong-namespace.

{"filter namespace", "correct-namespace", []string{"a"}, []runtime.Object{pod("correct-namespace", "a"), pod("wrong-namespace", "b")}}

The third case will make sure that we don't return anything if there are no pods in the desired namespace.

{"wrong namespace", "correct-namespace", nil, []runtime.Object{pod("wrong-namespace", "b")}}

Putting it all together.

func TestListImages(t *testing.T) {
	var tests = []struct {
		description string
		namespace   string
		expected    []string
		objs        []runtime.Object
	}{
		{"no pods", "", nil, nil},
		{"all namespaces", "", []string{"a", "b"}, []runtime.Object{pod("correct-namespace", "a"), pod("wrong-namespace", "b")}},
		{"filter namespace", "correct-namespace", []string{"a"}, []runtime.Object{pod("correct-namespace", "a"), pod("wrong-namespace", "b")}},
		{"wrong namespace", "correct-namespace", nil, []runtime.Object{pod("wrong-namespace", "b")}},
	}

	for _, test := range tests {
		t.Run(test.description, func(t *testing.T) {
			client := fake.NewSimpleClientset(test.objs...)
			actual, err := ListImages(client.CoreV1(), test.namespace)
			if err != nil {
				t.Errorf("Unexpected error: %s", err)
				return
			}
			if diff := cmp.Diff(actual, test.expected); diff != "" {
				t.Errorf("%T differ (-got, +want): %s", test.expected, diff)
				return
			}
		})
	}
}
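
For completeness, the test file needs a handful of imports that the snippets above assume. With the client-go and apimachinery versions used in the example repo, they look roughly like this:

import (
	"testing"

	"github.com/google/go-cmp/cmp"
	v1 "k8s.io/api/core/v1"
	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes/fake"
)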