I’m very curious about the implementation of the dot algorithm and have been trying to implement my own version as a learning exercise. I have a few questions, though, that I was hoping someone could answer.
The following questions all relate to the Crossing Minimisation step.
From the dot implementation paper:
init_order initially orders the nodes in each rank. This may be done by a depth-first or
breadth-first search starting with vertices of minimum rank. Vertices are assigned positions in
their ranks in left-to-right order as the search progresses. This strategy ensures that the initial
ordering of a tree has no crossings. This is important because such crossings are obvious, easily avoided “mistakes.”
I’m not sure I fully understand this step. It mentions that “This strategy ensures that the initial ordering of a tree has no crossings.” But I thought removing/minimising crossings is what the subsequent code is supposed to be doing? Can you give an example of what it’s doing?
A lot of papers I have read relating to layered graphs mention randomising the order of the vertices in the top rank/layer after each iteration pass so you get new orderings to test. However, I see no mention of randomisation in the dot paper for crossing minimisation. Is dot not using random sorting? How is it generating different orders to progress the state? Is it relying solely on the initial state and transposing?
The problem I am finding is that my code often gets stuck in a local optimum. How does dot get around this issue?
From the dot implementation paper:
One final point is that it is generally worth the extra cost to run the vertex ordering algorithm twice:
once for an initial order determined by starting with vertices of minimal rank and searching out-edges,
and the second time by starting with vertices of maximal rank and searching in-edges. This allows one
to pick the better of two different solutions.
Is this basically saying that it runs the algorithm twice, with the second pass essentially flipping the graph upside down?
<Pass 0>
[A, B, C]
[D, E]
<Pass 1>
[D, E]
[A, B, C]
?
I should also note that I am also trying to make sense of the C++ code, but as it’s heavily optimised, with little commenting, I’m finding it challenging.
Thank you for your interest in our code and for trying to read the 1993 TSE paper. I can answer some questions.
By now you probably have seen the Wikipedia article that gives a good overview of the area.
Since that paper was written, there has been a lot of work on this, so you can find better explanations and better algorithms. Kaufmann and Wagner (Springer LNCS 2025) is intended to be a good tutorial. Some other studies worth noting are Bachmaier and Forster 2006, who address intra-level edges (which we don’t, at least not very well), and Brandes and Köpf 2001 (which is addressed by an erratum Ulrik co-authored later, so I would start there).
As you discovered, this method is mostly a collection of heuristics, and there are a lot of places where there are alternatives that can work equally well or better.
Well, you’re probably right. If there were crossings initially, we would hope to get rid of them anyway. But the real point is that we get initial layouts of series-parallel graphs that are without crossings. I’m not sure Sugiyama’s heuristic ensures this otherwise; in any case, we felt it was a good approach.
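To illustrate the idea, here is a minimal sketch of a DFS-based init_order (my own illustration, not the Graphviz source): each node is appended to its rank’s order the first time it is visited, so every subtree occupies a contiguous left-to-right block on each rank, and a tree comes out with no crossings.

    // Sketch of DFS-based initial ordering, assuming ranks were already
    // assigned by the earlier ranking phase. Not the actual Graphviz code.
    #include <cstdio>
    #include <vector>

    struct Graph {
        std::vector<std::vector<int>> out;  // out[v] = out-neighbors of v
        std::vector<int> rank;              // rank[v] = layer of v
    };

    // order[r] lists the nodes of rank r, left to right.
    void dfs(const Graph& g, int v, std::vector<bool>& seen,
             std::vector<std::vector<int>>& order) {
        if (seen[v]) return;
        seen[v] = true;
        order[g.rank[v]].push_back(v);  // take the next free slot on v's rank
        for (int w : g.out[v]) dfs(g, w, seen, order);
    }

    int main() {
        // Tree: 0 -> {1,2}, 1 -> {3}, 2 -> {4}, with ranks 0/1/1/2/2.
        Graph g{{{1, 2}, {3}, {4}, {}, {}}, {0, 1, 1, 2, 2}};
        std::vector<bool> seen(5, false);
        std::vector<std::vector<int>> order(3);
        dfs(g, 0, seen, order);  // start from the minimum-rank vertex
        for (int r = 0; r < 3; ++r) {
            std::printf("rank %d:", r);
            for (int v : order[r]) std::printf(" %d", v);
            std::printf("\n");
        }
        // Prints rank 1 as "1 2" and rank 2 as "3 4": edges 1->3 and 2->4
        // are parallel, so the tree is drawn with zero crossings.
    }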
It’s the sorting that mainly moves the optimization ahead, with some tricks to help escape local minima. We don’t randomize initial node placement because we see the input order of the nodes as very useful information. Yes, there is definitely a problem with getting stuck in local minima; in a way, that’s the whole problem here. I think in addition to the transpose heuristic, there’s some kind of untangling phase for long edges that is run at the end.
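The sorting in question is the weighted-median heuristic (wmedian in the paper), which alternates with transpose over a fixed number of up-and-down sweeps, keeping the best order seen. Below is a simplified, self-contained sketch of both steps on a single two-layer graph (my own illustration; the paper’s version sweeps over all adjacent rank pairs and uses weighted medians):

    // Median sort + transpose on one two-layer graph. An edge connects
    // position f on the fixed layer to node v on the free layer.
    #include <algorithm>
    #include <cstdio>
    #include <utility>
    #include <vector>

    using Edges = std::vector<std::pair<int, int>>;  // (fixedPos, freeNode)

    // Two edges cross iff their endpoints compare in opposite orders.
    int crossings(const Edges& e, const std::vector<int>& pos) {
        int c = 0;
        for (size_t i = 0; i < e.size(); ++i)
            for (size_t j = i + 1; j < e.size(); ++j) {
                long a = e[i].first - e[j].first;
                long b = pos[e[i].second] - pos[e[j].second];
                if (a * b < 0) ++c;
            }
        return c;
    }

    int main() {
        Edges e = {{0, 2}, {1, 1}, {2, 0}, {2, 2}};
        std::vector<int> order = {0, 1, 2};  // free layer, left to right

        auto posOf = [&] {
            std::vector<int> p(order.size());
            for (int i = 0; i < (int)order.size(); ++i) p[order[i]] = i;
            return p;
        };
        std::printf("before: %d crossings\n", crossings(e, posOf()));  // 3

        // Median step: sort free nodes by the median position of their
        // neighbors on the fixed layer.
        auto median = [&](int v) {
            std::vector<int> ns;
            for (auto& [f, w] : e)
                if (w == v) ns.push_back(f);
            std::sort(ns.begin(), ns.end());
            return ns.empty() ? -1 : ns[ns.size() / 2];
        };
        std::stable_sort(order.begin(), order.end(),
                         [&](int a, int b) { return median(a) < median(b); });

        // Transpose step: swap adjacent pairs as long as a swap reduces
        // crossings. This greedy local search is what can get stuck in a
        // local minimum.
        bool improved = true;
        while (improved) {
            improved = false;
            for (size_t i = 0; i + 1 < order.size(); ++i) {
                int before = crossings(e, posOf());
                std::swap(order[i], order[i + 1]);
                if (crossings(e, posOf()) < before) improved = true;
                else std::swap(order[i], order[i + 1]);  // undo
            }
        }
        std::printf("after: %d crossings\n", crossings(e, posOf()));  // 1 (optimal here)
    }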
We get stuck in local minima. That’s the crux of the layout problem really. What a lot of people have noticed is that in graph layout a “pretty good and not obviously wrong” layout is acceptable for most purposes (or at least it’s the best we know how to do).
Yes. We don’t flip the data structure, just run the iteration in the opposite direction.
I hope some of the references will be helpful. There might be some Vimeo or Slideshare talks that help too.
I’m still unsure about question 1, though. When generating the initial order (even for series-parallel graphs), you have to start from some order of the vertices on layer 0 to perform the depth-first search, right?
The order of that layer changes how many crossings you end up with after the depth-first search. So how is this order decided?
For example, let’s say that after making the graph acyclic, you end up with layers:
L0: [A] [B] [C]
L1: [D] [E]
(Assume there are edges connecting the nodes.)
At this point you run init_order, performing a depth-first search to order the nodes in each layer.
Depending on the order of [A] [B] [C], you would get a different result from init_order. So how is the order before running init_order determined?
It’s simple. The underlying library keeps nodes and edges in their input order. We create a queue, initially empty. We scan the nodes (in input order) to find all nodes with indegree 0 and enqueue them (maintaining that order). Then we peel off the node at the head of the queue and place it at the next position in its level. We scan the out-edges of that node and, for each head node, if all the tails of its in-edges have now been placed, it’s enqueued. Repeat this until the queue is empty.
We also try a second pass where we exchange in-edges/out-edges in the above.
To cope with flat edges: after the above is done, if there are any flat-edge constraints, I think effectively a topological sort of the nodes within each level is computed, and any conflicts are resolved by reassigning node positions.
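In code, the main queue procedure looks roughly like this (a simplified, self-contained sketch of the description above, not the actual Graphviz source):

    // Queue-based initial ordering: a topological sort (Kahn's algorithm)
    // that preserves input order, placing each node at the next free
    // position in its precomputed rank.
    #include <cstdio>
    #include <queue>
    #include <vector>

    int main() {
        // Nodes 0..4 in input order; out-edges per node, also in input order.
        // Ranks are assumed to come from the earlier ranking phase.
        std::vector<std::vector<int>> out = {{3}, {3, 4}, {4}, {}, {}};
        std::vector<int> rank = {0, 0, 0, 1, 1};
        const int n = 5, maxRank = 1;

        std::vector<int> indeg(n, 0);
        for (int v = 0; v < n; ++v)
            for (int w : out[v]) ++indeg[w];

        std::queue<int> q;
        for (int v = 0; v < n; ++v)  // scan in input order for indegree-0 nodes
            if (indeg[v] == 0) q.push(v);

        std::vector<std::vector<int>> order(maxRank + 1);
        while (!q.empty()) {
            int v = q.front(); q.pop();
            order[rank[v]].push_back(v);  // next free position in v's level
            for (int w : out[v])          // w is ready once all its in-edge
                if (--indeg[w] == 0)      // tails have been placed
                    q.push(w);
        }

        for (int r = 0; r <= maxRank; ++r) {
            std::printf("level %d:", r);
            for (int v : order[r]) std::printf(" %d", v);
            std::printf("\n");  // level 0: 0 1 2, level 1: 3 4
        }
        // For the second pass, exchange the roles of in- and out-edges
        // (start from nodes of maximal rank) and keep the better result.
    }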