It’s not optimal because there are gaps around the 8-21-10-20-cluster. Can it be made more compact? I found after some experimentation that the 25-21 node is the culprit. If I reverse it, dot gives me:
Which is much nicer. It appears to me that dot reasons that if there is an edge a → b then b most likely should be placed below a. That is perhaps too restrictive. If we could tell dot to use the dominance relation it would render call-flow and call-flow-like graphs better. It says that a dominates b if there is no path from any root to b that does not pass through a. So if a dominates b, b would be placed below a (rank(b) > rank(a)). In this case, with the root 1, 14 would be placed below 18, 22 below 27, but 21 would not (necessarily) be placed below 25 since 25 does not dominate 21.
It appears to me that dot reasons that if there is an edge a → b then b most likely should be placed below a.
This is a feature of dot. From the description page:
The layout algorithm aims edges in the same direction (top to bottom, or left to right) and then attempts to avoid edge crossings and reduce edge length.
If the graph has directed cycles, some edges need to be flipped. Dot uses a heuristic based on the order nodes appear in the input, which does a reasonable job of reflecting dominance based on how graphs are typically generated. However, this is not relevant for your example. Since neither 21 nor 25 dominates the other, dot would have to try both edge directions to pick a better one, for some definition of better.
We experimented with something like this in trying to support an aspect ratio attribute, but the algorithms we tried caused so many problems with other beneficial features of dot that we gave up. Please feel free to submit a modification request. Thanks.
I would probably like a layout algorithm that does not use the order of the nodes as a heuristic. Since the graphs are generated and not manually entered the order is mostly immaterial. So for me it would be better if Graphviz analyzed the structure of the graph to determine which nodes should be on top of others.
You could write a preprocessor that takes an arbitrary directed graph and turns it into a DAG. Then finds the dominance relationships in the DAG, and where directed edges are not in a dominance relationship, mark them as nonconstraint edges or minlen=0 edges. This would be a nice little intern project. It could be coded in any language really (say graphviz gvpr, or graphviz C/C++, or any graph package in python or R or Rust or whatever that can at least write in graphviz format.)
As a remark, the whole point of the dot layout is to put b below a if there is an edge a → b. If this isn’t important to your purpose, you might want a different type of layout altogether, such as neato. Also, it turns out that using the order of input heuristic for breaking cycles often works well with generated graphs, as the generation process tends to reflect domain-specific semantics which are then honored in the layout.