I’ve encountered a reproducible init_rank / triangulation failure that I’ve narrowed down to a specific structural pattern involving clusters and cross-cluster edges.
Targeting multiple nodes within the same destination cluster
Creating a pattern that confuses the rank assignment algorithm
The total number of cross-cluster edges doesn’t seem to be the issue (the working version still has cross-cluster edges), but rather the fan-in pattern of many edges from one cluster converging on multiple nodes within another cluster.
Would this be helpful for debugging? Is this a known limitation of the ranking algorithm with clustered graphs?
Sorry about the problems. It’s a bug we’ve seen before. “trouble in init_rank” almost certainly means there are cyclic constraints that make the network simplex solver fail. After that more things can go wrong, such degeneracy that bombs out the triangulation for spline routing.
Cyclic constraints are probably generated by some flaw in the logic of lib/dotgen/position.c
It’s quite difficult to debug. Maybe it would help if we had scaffolding in our code to generate tags (internal labels) on constraint edges annotating how they are created, so we could more easily understand why cycles can be created. Agree this has something to do with inter cluster edges, and maybe flat edges. Does your original graph have text labels on edges?
As you may already know, the extract marked “Failing pattern (simplified)” itself doesn’t fail by itself.
Yes, the full graph does contain real cycles (at the graph level). What’s odd is that despite those cycles, we often can render the SVG fine with dot, and the trouble in init_rank failure only appears for certain structural configurations (e.g., the cluster + inter-cluster fan-in pattern).
Our render settings are:
dot engine
rankdir=LR, nodesep=0.3, ratio=auto
On your suggestion: should we try to break/remove all cycles in the original graph and re-test to see if the issue still reproduces?
And yes: the original graph has text labels on edges (and node labels too). The simplified pattern was just illustrative, and as you noted it doesn’t fail on its own.
Bugs often seem odd, sometimes even after they’re understood and corrected.
It won’t help to remove cycles in the input graph. The constraint cycles we’re referring to here are created by some broken logic in the ~2000 lines of code soup in C that set up the linear program (i.e. another, internal graph) that is solved to get node coordinates (Y coordinates in your case with rankdir=LR).
It is, for all practical purposes, impossible to debug without creating annotations on the objects involved in the cycle to get a better idea where the constraint cycle is made. Even if we could infer something like “the cycles involve nested clusters with intra and inter cluster edges at least one of which has an associated text label’“ I think we need something more. Another option is to just stare at the code for a long time and hope the truth become evident.
I’m working on these annotations, actually, but no promises.
Understood — the cycles are in Dot’s internally generated constraint graph, not our input DOT, so removing input cycles won’t help. Thank you for clarifying,
Yes, GraphViz is known to have this peculiarity. Cyclic rank conflicts are caused by init_rank chokes, which occur when numerous edges from one cluster converge on several nodes in another. Because it converts cross-cluster edges into internal ones, relocating N5 is effective. Common solutions include lowering fan-in, applying `constraint=false` to some edges, or switching to `neato`/{sfdp` for large graphs.
This is helpful. I would like to fix this. I replicated the instrumentation a Sam Sneddon reported a few years ago in graphviz#1939, so when a cycle occurs, we can get a better idea why. (My approach feels clumsy; not sure how @gsnedders did this, but, kudos.). Sam points out that each constraint in the cycle seems plausible, but mincross has twisted at least two sibling clusters, so it fails. Not sure how this phenomenon relates to the existing of many cross-cluster edges. It appears, wishful thinking about mincross edge crossing penalty factors doesn’t prevent this bug.