Topological Sort

8 Sep 2019

If I wanted to implement a toy version of make, what would it take? For instance, what if I had the following file, that looks sort of like a Makefile that has only targets and dependencies, and I wanted to give a program any target, and get back a list of sub-targets that would get built first, in the correct order?

main: main.so reader.so writer.so

main.so: main.o

reader.so: reader.o parser.o lexer.o

writer.so: writer.o buffer.o

main.o: main.c

reader.o: reader.c

parser.o: parser.c

lexer.o: lexer.c

writer.o: writer.c

buffer.o: buffer.c

main.c:

reader.c:

parser.c:

lexer.c:

writer.c:

buffer.c:

First, I'd have to understand that what I'm looking for is Topological Sorting.

Then, I'd have to figure out a way to represent the above information in Go.

Let's say I wrote a little parser that turned the above file into this data structure:

var graph = map[string][]string{
	"main":      []string{"main.so", "reader.so", "writer.so"},
	"main.so":   []string{"main.o"},
	"reader.so": []string{"reader.o", "parser.o", "lexer.o"},
	"writer.so": []string{"writer.o", "buffer.o"},
	"main.o":    []string{"main.c"},
	"reader.o":  []string{"reader.c"},
	"parser.o":  []string{"parser.c"},
	"lexer.o":   []string{"lexer.c"},
	"writer.o":  []string{"writer.c"},
	"buffer.o":  []string{"buffer.c"},
	"main.c":    []string(nil),
	"reader.c":  []string(nil),
	"parser.c":  []string(nil),
	"lexer.c":   []string(nil),
	"writer.c":  []string(nil),
	"buffer.c":  []string(nil),
}

As we can see, the map's keys are targets. The map's values are lists of dependencies, which in turn are targets.

If we want to give the name of a target and see a printout of the processing order of dependencies, we just need a little recursive function like so:

func build(target string) {
  if deps, found := graph[target]; found {
    for _, dep := range deps {
      build(dep)
    }
    fmt.Printf("BUILD %s\n", target)
  } else {
    fmt.Printf("TARGET NOT FOUND: \"%s\"\n", target)
  }
}

If we provide a target that has dependencies, we recursively call build() on the dependencies.

If a target has no dependencies (the base case that escapes our recursion), we print the fact that we are building the target.

For instance, if we want to build reader.so, its dependencies have to get built first. A little program that uses the above will give us the right answer:

$ ./toymake reader.so
BUILD reader.c
BUILD reader.o
BUILD parser.c
BUILD parser.o
BUILD lexer.c
BUILD lexer.o
BUILD reader.so

This is, of course, a toy example, because we do nothing to detect cycles, or any other robust stuff. But as with a lot of toy implementations, the basic idea is clear.

Of course, there are tools that do this for us, and not just make! There is also the tsort command-line utility.

It's not quite as sophisticated as our toy make, in that we cannot give it a target; we can only get a listing of all the targets in correc build order. But, if we format our input file in a format understandable by tsort:

main main.so 
main reader.so
main writer.so
main.so main.o
reader.so reader.o
reader.so parser.o
reader.so lexer.o
writer.so buffer.o
writer.so writer.o
main.o main.c
reader.o reader.c
parser.o parser.c
lexer.o lexer.c
writer.o writer.c
buffer.o buffer.c

tsort will show us the order in which targets need to be built:

$ tsort tsort.txt | tac
main.c
reader.c
parser.c
lexer.c
buffer.c
writer.c
main.o
reader.o
parser.o
lexer.o
buffer.o
writer.o
main.so
reader.so
writer.so
main

Which is pretty spiffy.