Everything is a Graph

Jul 14, 2021

Graphs are possibly the most powerful data structure in mathematics and computer science. Moreover, graphs are found in curious places.

First, a graph is defined by a set of things (vertices) and a set of relations (edges) between them. There are undirected graphs, like the social network on Facebook, where the edges are bidirectional, e.g., A and B are friends. There are also directed graphs (digraphs), such as a flow chart, where relations have a direction, i.e., A to B is different than B to A.

Excel. Every time you add a cell reference, Excel adds a new edge to a directed graph. Then, to figure out what other cells need to be recalculated, it topologically sorts the graph to figure out what to do first. A topological sort turns a graph into a list of tasks that must be done in order.

Package Management. Packages depend on each other, which in turn depend on different packages. You end up with a large directed acyclic (no cycles) graph (DAG). Cycles cause issues because of unmatched dependencies, e.g., A1 depends on B1, but B1 depends on A2. Again, the package manager runs a topological sort on the DAG to determine what packages to install first and what needs to be reinstalled when one changes. Check out my post on the importance of package management here.

Graph Databases. Startups like Neo4j build a graph database that stores graphs. It makes it easy to ask questions like the shortest path between A and B? what is the centrality of A?

Software Pipelines. Software runs through a lifecycle of build, test, and deploy (at minimum). Each of these stages has its own pipeline - code must be submitted, compiled, packaged, tested, configured, and deployed. These pipelines (commonly known as Continuous Integration and Continuous Deployment, CI/CD) are just graphs that need to execute steps.

Machine Learning. Neural networks are examples of computation graphs. These graphs can have trillions of edges.

Compilers. Software compilers use graph theory for optimization and static analysis. For example, a control-flow graph (CFG) represents the paths that a program may take during its execution, e.g., if-else or for loops. Compilers use to try to run different algorithms on the CFGs to reduce or eliminate them.

Others.

  • Social graphs
  • Knowledge graphs
  • GPS or map routing
  • Computer networks

Software configuration. Last year, I wrote a graph-based programming language focused on describing directed and undirected graph configurations. You can read a post on it here.