Live Programming
https://xkcd.com/303/

Every developer knows the pain of the inner development loop. Make code changes, rebuild, run tests, redeploy, and examine the differences. All developers do this - from frontend web developers to backend cloud infrastructure engineers. Engineers hate repetitive tasks and try to automate everything, so it's only natural that they would try to automate this loop as well.

I call it live programming, but it goes by many names: hot reloading, hot swapping, interactive programming. It's the practice of automating and optimizing the build-and-deploy pipeline so that developers see their code changes instantly. Three forces enable the live programming paradigm:

Moore's Law. We have more powerful computers that can compile code more quickly.

Standardized tooling. Docker is a standard build and runtime target that enables us to automate build and deploy pipelines.

The rise of interpreted languages. Python and JavaScript provide a large userbase for live programming tools.

Live programming tools need three components.

  1. File-watcher. The tool's value proposition is that the inner development loop is no longer a manual process, so there need to be events that trigger each stage. File changes are a logical entry point (a minimal sketch follows this list).
  2. Packager. Before Docker, it wasn't easy to find a universal packager. As a result, the previous generation of tools was language-specific - one for JavaScript, another for Ruby, another for Python, etc.
  3. Runtime. Not only was packaging language-specific, but so was the runtime. Static websites could get away with generic webservers, but other use cases needed language-specific servers that knew how to hot-reload classes and functions. Docker changed this too, providing a wrapper around language runtimes.
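To make the file-watcher component concrete, here is a minimal sketch in Python using only the standard library. It polls modification times and re-runs a command whenever anything under the watched directory changes; the `src` path and the `make build` command are placeholders, and real tools typically use OS-level file events rather than polling.

```python
import os
import subprocess
import time

def snapshot(root):
    """Map every file under root to its last-modified time."""
    mtimes = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime
            except OSError:
                pass  # file vanished between walk and stat
    return mtimes

def watch(root, command, interval=0.5):
    """Re-run command whenever any file under root changes."""
    before = snapshot(root)
    while True:
        time.sleep(interval)
        after = snapshot(root)
        if after != before:
            before = after
            subprocess.run(command)  # the rebuild/redeploy stages hook in here

if __name__ == "__main__":
    watch("src", ["make", "build"])  # placeholder path and command
```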

Here are some examples of live programming tools.

  • skaffold. I'm biased because I created the tool. But skaffold uses Kubernetes as a developer platform, automating compiling software, building Docker images, and deploying to Kubernetes in a tight, iterative loop. The magic of skaffold is that it is the only tool that is genuinely full-stack: interpreted or static code (JavaScript, Python, CSS, etc.) syncs directly to the running instance, compiled code triggers a rebuild and redeploy, and configuration changes trigger a redeploy.
  • Gunicorn/Flask/Django. Live programming for Python web developers (a minimal example follows this list).
  • webpack. The packager and development server for JavaScript and TypeScript. It also works for Ruby on Rails. While many other live programming tools evolved from runtimes, webpack evolved from packaging.
  • Pluto.jl. A notebook-like tool for Julia programmers that automatically updates all affected cells when a function or variable is changed.
  • Observable. A live programming environment for JavaScript, primarily focusing on visualizations and data analysis (from the creator of d3.js).
  • Excel. The original live programming environment. As cells change, Excel recalculates dependent cells. Developers and users can see each computation at intermediate steps.
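As a small, concrete example of the runtime component, Flask's built-in development server (from the Gunicorn/Flask/Django entry above) restarts the process whenever a source file changes. A minimal app, assuming Flask is installed:

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello, live programming!"

if __name__ == "__main__":
    # debug=True turns on the reloader: save a change to this file
    # and the development server restarts automatically.
    app.run(debug=True)
```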
Reducing Errors in Decision-Making
From Kahneman, Sibony, and Sunstein's book Noise

We get things wrong. But understanding the anatomy of the error can be more important than the judgment itself.

Errors can be thought of in two ways: bias and noise. Bias is systematic error in the same direction. Noise is variability in judgments that should be identical.
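The two error types can be separated numerically. In the book's terms, mean squared error decomposes into bias squared plus noise squared. A small sketch with made-up judgments of a known true value:

```python
import statistics

true_value = 100
judgments = [112, 108, 115, 104, 111]  # hypothetical forecasts of the same quantity

bias = statistics.mean(judgments) - true_value  # shared directional error
noise = statistics.pstdev(judgments)            # variability around the shared level
mse = statistics.mean((j - true_value) ** 2 for j in judgments)

print(f"bias={bias:.1f}, noise={noise:.2f}, mse={mse:.1f}")
# The book's "error equation": MSE = bias^2 + noise^2
assert abs(mse - (bias**2 + noise**2)) < 1e-9
```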

Noise can be good. Disagreements and contrarian thinking are essential ingredients to innovation—the market tests competing strategies. But there are many decisions where noise is a problem. Those of us who think analytically often believe that random errors cancel each other out. However, in performance ratings, judgment calls, and measurement, noise is highly detrimental to companies.

We can sometimes correct bias by examining the decision-making process.

What should you look for when trying to reduce decision-making bias?

  • Planning fallacy - Did people question the sources when they used data? How wide were confidence intervals for uncertain numbers?
  • Loss aversion - Is the risk profile of the decision-making team aligned with the company?
  • Present bias - Do the factors that led to the decision align with the short-term and long-term priorities of the company?
  • Anchoring - Were uncertain numbers a significant factor in the decision?
  • Nonregressive prediction - Did the decision-makers make insufficient allowance for regression towards the mean while predicting from an imperfectly reliable predictor?
  • Premature closure - Was there accidental bias in the choice of considerations discussed early on?
  • Excessive coherence - Were alternatives fully considered? Was uncomfortable data suppressed?
  • Initial prejudgments - Do any of the decision-makers stand to profit more from one conclusion than another?

But even when we eliminate bias, system noise remains and still causes wrong decisions. Kahneman describes two types of noise: level noise and pattern noise. Level noise is variation across individuals. In a performance review, some reviewers are more generous than others; that's level noise. Judgment scales are ambiguous ("on a scale of 1 to 10"), and words may mean different things to different people ("exceeds expectations"). Pattern noise is the difference in how individual people respond to the same things. It could be due to differences in principles or values that a person holds, consciously or unconsciously.

How do you reduce noise?

  • Measure noise

What's measured gets managed. Kahneman and his co-authors ran a noise audit at a company. The executives estimated that judgments would differ by 5% to 10% across the organization. The results were shocking: the "noise index" ranged from 34% to 62%.
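As a back-of-the-envelope illustration, here is one way to formalize a noise index for a single case: the average gap between every pair of judgments, expressed as a percentage of the pair's mean. The quotes below are invented:

```python
from itertools import combinations
from statistics import mean

# Hypothetical premiums quoted by five underwriters for the same case.
quotes = [9500, 16700, 12000, 8800, 13300]

def noise_index(judgments):
    """Average |a - b| over every pair, as a percentage of the pair's mean."""
    ratios = [abs(a - b) / mean([a, b]) for a, b in combinations(judgments, 2)]
    return 100 * mean(ratios)

print(f"noise index: {noise_index(quotes):.0f}%")
```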

  • Structure judgments into several independent tasks

Divide and conquer. Breaking decisions up into independent tasks reduces the tendency for people to distort or ignore information that doesn't fit the emerging story. Structured interviews are a great way to put this into practice.
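A sketch of what "independent tasks" can look like in a structured interview, with hypothetical dimensions; each dimension is scored on its own, and aggregation happens only at the end:

```python
# Hypothetical dimensions, each scored independently (ideally by
# different interviewers) before any group discussion takes place.
scores = {
    "coding": 4,
    "system design": 3,
    "communication": 5,
}

# Combine only after every independent judgment is recorded, so an
# early impression on one dimension can't color the others.
overall = sum(scores.values()) / len(scores)
print(f"overall: {overall:.2f} out of 5")
```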

  • Resist premature intuitions

A decision made after careful consideration is always better than a snap judgment. Kahneman suggests that professionals shouldn't be given information they don't need, which could bias them; he calls this sequencing the information.

  • Favor relative judgments and relative scales

Scales that use comparisons are less noisy than absolute scales.

  • Obtain independent judgments from multiple teams, and then consider aggregating those judgments

Group discussions often create noise. Averaging different independent decisions will reduce noise but may not tackle bias.
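A quick simulation of that last point, with invented numbers: averaging n independent judgments shrinks noise roughly by a factor of sqrt(n), while a bias shared by every judge survives the averaging untouched.

```python
import random
import statistics

random.seed(0)
true_value, shared_bias, judge_noise = 100, 8, 10

def panel_average(n_judges):
    """Average of n independent judgments that all share the same bias."""
    return statistics.mean(
        true_value + shared_bias + random.gauss(0, judge_noise)
        for _ in range(n_judges)
    )

for n in (1, 4, 16):
    panels = [panel_average(n) for _ in range(10_000)]
    print(f"n={n:2d}: mean error={statistics.mean(panels) - true_value:5.2f}, "
          f"noise={statistics.stdev(panels):5.2f}")
# Noise falls from ~10 to ~5 to ~2.5, but the mean error stays near the bias of 8.
```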

These lists were taken from Kahneman, Sibony, and Sunstein's book Noise: A Flaw in Human Judgment.

First Principles
If you are to do important work then you must work on the right problem at the right time and in the right way. Without any one of the three, you may do good work but you will almost certainly miss real greatness.
Richard Hamming

There is no magic answer that will guarantee that all three conditions are satisfied. But by making a series of 95%-confidence bets and gathering more data along the way, you can quickly arrive at futures that seem impossible to others.
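The arithmetic is worth making explicit, with illustrative numbers: five bets at 95% confidence each compound to a likely path, while an outsider who prices each step at even odds sees the same path as a long shot.

```python
bets = 5
insider = 0.95 ** bets   # ~77%: five 95%-confidence bets, seen from inside
outsider = 0.50 ** bets  # ~3%: the same path priced at even odds per step
print(f"insider: {insider:.0%}, outsider: {outsider:.0%}")
```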

A first principle is something that cannot be deduced from any other axiom or assumption. First-principles thinking is about minimizing assumptions, and fewer assumptions mean less risk. That link to risk minimization makes first-principles thinking a great decision-making framework.

We make better predictions with more data. Bayes' Theorem tells us how prior knowledge of conditions can affect the probability of an outcome. Machine learning models often do much better with more training data. First-principles thinking helps build a foundation by breaking down big decisions into a series of small but probable bets.
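As a minimal illustration of that Bayes step, with invented numbers: a prior belief is updated as each new piece of supporting evidence arrives, and the posterior sharpens with more data.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior probability of a hypothesis after observing evidence."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1 - prior)
    return numerator / evidence

# Invented numbers: start at 50/50; each observation is twice as
# likely under the hypothesis as under its negation.
belief = 0.5
for _ in range(3):
    belief = bayes_update(belief, p_e_given_h=0.6, p_e_given_not_h=0.3)
    print(f"belief: {belief:.2f}")  # 0.67, 0.80, 0.89
```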

I didn't know what I wanted to do in undergrad and thought that anyone who claimed they did was lying. But I did know that STEM fields seemed to lead to more exciting outcomes - economists, physicists, mathematicians, computer scientists, engineers. So I studied mathematics for maximum optionality: I could always go from math to physics or from math to computer science more easily than in the other direction. It was a small bet, considering the optionality. Even then, I hedged with classes in philosophy, history, and the classics.

After making that decision, the possibilities were narrower, but the choices were more straightforward. Where is the most important work happening? In the 1950s, the answer was physics, but now it's easily computer science.

Within computer science, what is here to stay? Programming languages go in and out of style, but algorithms, data structures, and abstractions seemed to be foundational knowledge. That's why I stayed away from classes that required programming, electing to learn that on my own instead. Betting on which languages or frameworks will exist in the future is riskier, so focus on the theory.

With a foundation in mathematics and computer science, the paths again narrow. Now, hardware or software?

I made a bet on software. It has more leverage and optionality, and it is non-rival. In addition, I could iterate faster in software than in hardware, trying and testing new paths to find important work.

To find a specialization in software, I asked myself: what is obvious? One hypothesis that seemed obvious is that nearly all companies will use cloud computing in some form in the future. The statement may be obvious to many, but it has profound implications. The cloud also has high optionality: it covers many areas of software - operating systems, networking, distributed computing, databases, machine learning, and pretty much every other subfield of computer science.

What's the next logical decision after cloud? As you get higher up the stack, the decisions become individually riskier. However, with the correct foundation, the cost of a wrong leaf decision is lower.

A crucial part of Hamming's quote is getting the timing right. Decisions further up the stack have shorter half-lives (on the longevity of ideas, see The Lindy Effect). Platforms are the next logical step in my mind: platforms codify best practices into new abstractions, gluing the building blocks together to make the theoretical model match the world. Unfortunately, platform bets are more likely to be wrong, and their half-lives are shorter.

But this is the benefit of foundational knowledge: the higher up the stack you are, the more easily a wrong bet can be pivoted into a correct one. If you pick the wrong platform, you are still directionally right with the cloud. And the knowledge of what doesn't work is valuable in making future decisions, so you're increasingly less likely to choose the wrong paths.

The series of bets results in a thesis that many people could never reach without the foundation. So you can finally start answering: what do I know that nobody else knows? To others, the prediction will seem near impossible. But through first principles and foundational decision-making, it will have been obvious.