GitHub has the opportunity to streamline and secure the package management layer. Here's how.
GitHub is the system of record for code, but the company rarely takes advantage of this. GitLab, on the other hand, has used the same position to build out products that span the entire software development lifecycle. GitHub's real strength is the sheer number of public projects it hosts, projects that end users consume mostly through package managers.
How does it work today? When a developer updates a package, they follow roughly these steps:
- Make some code changes and push to GitHub
- Tag that revision in git (e.g., v1.0.1) on GitHub
- Publish a release on GitHub
- Use that same tag and bundle the code into a zip file
- Publish to a package manager (e.g. npm for JavaScript/TypeScript)
Not only does the package manager have three pieces of redundant information (code, version, and package name), there's no guarantee that these correspond to the open source code on GitHub. Here's a quick list of a few things that go wrong in this process.
- Squatters sit on a popular name, so that a project needs to publish its packages under a slightly different name
- Malicious code is uploaded to the package manager that does not match what's on GitHub
- Package is maintained by someone else, not the author of the code
- GitHub is updated, but the author hasn't published the release to a package manager yet, so users can't use it
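To make the failure modes above concrete, here is a minimal sketch of the kind of consistency check that nobody runs today. The `Release` and `PublishedPackage` shapes and the `find_mismatches` function are hypothetical, invented for illustration and not part of any real registry API. It also assumes the published archive is byte-identical to the tagged source, which real packages with build steps often are not.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """A tagged release as recorded on the code host (hypothetical shape)."""
    repo: str       # e.g. "octo/widget"
    tag: str        # e.g. "v1.0.1"
    archive: bytes  # bytes of the source archive built from that tag


@dataclass(frozen=True)
class PublishedPackage:
    """A package version as recorded by a registry (hypothetical shape)."""
    name: str
    version: str
    source_repo: str
    archive: bytes


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an archive."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(release: Release, package: PublishedPackage) -> list[str]:
    """List the ways a published package disagrees with its claimed release."""
    problems = []
    if package.source_repo != release.repo:
        problems.append("package claims a different source repository")
    # Assumption: tags look like "v1.0.1" and versions like "1.0.1".
    if package.version != release.tag.removeprefix("v"):
        problems.append("version does not match the release tag")
    if digest(package.archive) != digest(release.archive):
        problems.append("published archive differs from the tagged source")
    return problems
```

An empty result means the registry entry and the release agree on all three redundant pieces of information. Anything else is a discrepancy that users currently have no easy way to see.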
GitHub can fix all of these issues simply by maintaining its own package registries that conform to each language's requirements (e.g., an npm-compatible endpoint for JavaScript, a PyPI-compatible index for Python). Packages would correspond one-to-one with published releases.
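One way to picture the one-to-one mapping is as a pure function from a release to a package identity, so that name and version are derived rather than typed in a second time. The sketch below is an assumption about how such a mapping might look, not a description of any existing GitHub feature: `package_for_release` is a hypothetical helper, the npm-style `@owner/repo` naming is one possible convention, and only plain `MAJOR.MINOR.PATCH` tags are accepted.

```python
import re
from typing import Optional

# Assumption: release tags are "1.0.1" or "v1.0.1"; anything else is not published.
VERSION_TAG = re.compile(r"v?(\d+\.\d+\.\d+)")


def package_for_release(owner: str, repo: str, tag: str) -> Optional[tuple[str, str]]:
    """Derive a (package name, version) pair from a release, or None if the tag is not a version."""
    match = VERSION_TAG.fullmatch(tag)
    if match is None:
        return None
    # The name comes from the repository itself, so it cannot be squatted separately.
    name = f"@{owner}/{repo}".lower()
    return (name, match.group(1))
```

For example, `package_for_release("octo", "widget", "v1.0.1")` yields `("@octo/widget", "1.0.1")`, while a tag such as `nightly` yields `None` and would not become a package version.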
Users could either configure their existing tools to point to GitHub's endpoints or GitHub could publish its own tool that covers the most popular languages.
Why would GitHub do this?
- It already owns npm, so that seems like aligned incentives to me
- Better data on library usage – downloads through package managers don't go through GitHub
- Quality-of-life improvements for open source maintainers
- Easier (and safer) use of third-party code, which turns the flywheel at GitHub