GitHub Copilot

Jul 10, 2021

GitHub and OpenAI launched a new product called Copilot, an AI model that suggests the next line of code to write. Copilot learned from all the public code on GitHub (unfortunately, including mine!). So should developers be worried about their jobs?

No, but it is a watershed moment for the industry. Copilot has the potential to increase developer productivity significantly. Why is Copilot different? The scale of the training data and compute. Other AI code-completion models have taken similar approaches but didn't have access to the corpus of data or the compute available to Microsoft and OpenAI.

Copilot is just the start of machine learning on code. We have a massive amount of potential training data, and it is trivial to turn code into structured data. Machine learning has the opportunity to automate much of the busy work that stops developers from doing their core tasks. Some ideas besides code completion:

  • Programming language translation. Turn Python into JavaScript.
  • Bug detection.
  • Resolving merge conflicts. See my Twitter thread on this.
  • Maintaining forks. Forks usually have some fundamental differences but need to keep up with upstream patches. Developers patch forks manually right now.
  • GPL/License infringement detection.
  • Synthetic datasets.
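
To make the earlier point about turning code into structured data concrete, here is a minimal sketch using Python's standard `ast` module. It parses a source string and emits one record per function. The sample source, the `function_records` helper, and the record fields are my own illustrative choices, not anything Copilot is known to use.

```python
import ast

# Hypothetical sample input; any Python source string would do.
SOURCE = """
def add(a, b):
    return a + b

def greet(name):
    print("hello", name)
"""


def function_records(code):
    """Return one dict per function definition found in `code`."""
    tree = ast.parse(code)
    records = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Collect the names of plain function calls made inside the body.
            calls = sorted({
                inner.func.id
                for inner in ast.walk(node)
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
            })
            records.append({
                "name": node.name,
                "args": [arg.arg for arg in node.args.args],
                "lineno": node.lineno,
                "calls": calls,
            })
    # Sort by position in the file so the output order is deterministic.
    return sorted(records, key=lambda record: record["lineno"])


if __name__ == "__main__":
    for record in function_records(SOURCE):
        print(record)
```

Records like these are only a starting point. A real training pipeline would presumably need far richer features, but the parsing step itself is a few lines of standard library code.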