Clustering gives undeterministic results

Hi,
I have this graph

digraph {  
  layout=dot ranksep=1 nodesep=1
  c_1 -> c_2 -> c_3 
  v_1 -> v_2 -> v_3 
  p_1 -> p_2 -> p_3 
  edge[constraint=false]
  c_1 -> v_1 -> p_1
  c_2 -> v_2 -> p_2
  c_3 -> v_3 -> p_3
  #subgraph cluster_root {
  subgraph cluster_c { label="c"  c_1 c_2 c_3  }
  subgraph cluster_v { label="v"  v_1 v_2 v_3  }
  subgraph cluster_p { label="p"  p_1 p_2 p_3  }
 #}
  edge[constraint=true]
  v_3 -> m_1
}

the second graph is the same and I just uncommented cluster_root

Then
dot.exe -Tsvg ./output/last/demo2.dot > ./dist/svg/demo2.svg
dot.exe -Tsvg ./output/last/demo1.dot > ./dist/svg/demo1.svg

And I did that multiple times… few screenshots demonstrate the different results.
Sometimes I have the right order, sometimes not.



Is there a way of writing a deterministic graph?

os : windows 10
graphviz : dot - graphviz version 2.49.3

Thanks

This is unfortunately a known bug, The order of clusters in gv or ps output is not stable on Windows (#1789) · Issues · graphviz / graphviz · GitLab

thanks,

so I have no other choice but to use Graphviz under Linux to have the right result directly?

By adding invisible nodes and edges to constraint the graph I guess, I almost have the desired result.
The cost is that I have some margin:

digraph {  
  layout=dot ranksep=1 nodesep=1
  node[shape="point" style=invis]
  edge[constraint=false style=invis]
  fc -> fv -> fp
  node[shape="ellipse" style=none]
  edge[constraint=true style=none]
  fc -> c_1 [style=invis]
  fv -> v_1 [style=invis]
  fp -> p_1 [style=invis]
  c_1 -> c_2 -> c_3
  v_1 -> v_2 -> v_3
  p_1 -> p_2 -> p_3
  edge[constraint=false]
  c_1 -> v_1 -> p_1
  c_2 -> v_2 -> p_2
  c_3 -> v_3 -> p_3
  subgraph cluster_root {
  subgraph cluster_c { label="c"  c_1 c_2 c_3   }
  subgraph cluster_v { label="v"  v_1 v_2 v_3   }
  subgraph cluster_p { label="p"  p_1 p_2 p_3  }
 }
  edge[constraint=true]
  v_3 -> m_1
}

If someone has a better idea, please share.

Thanks

Unfortunately I don’t have great suggestions for a work around. To compound matters, the problem actually exists under Linux as well but is latent. I.e. a future change in Linux could expose the same issue there.

The underlying problem is a classic treating-pointer-values-as-numbers situation, which affects iteration order. The values you get from heap-allocated pointers are not the same across executions due to things like ASLR. On Linux (and in particular under Glibc) things mostly work out because heap pointers are generally increasing for sequential allocations. We actually fixed the problem previously but the fix caused a regression in a Graphviz tool fdp and we had to revert it. I have not had a chance to debug the fdp breakage but the code looked pretty hairy.

“Why is Graphviz designed that way?” Well at the time this code was written, things like ASLR did not yet exist. Memory was precious and promiscuously passing values between pointer and int types was a common practice pre-ISO-C. One of the more senior maintainers may be able to provide further details.

Anyway, that was probably more information than you wanted to know and not particularly helpful to you. Deterministic execution is a fairly baseline property, and I’m sorry we’ve ended up in this situation where Graphviz lacks it. You may be able to achieve some determinism by disabling ASLR and friends, but be aware there are security implications to this.

An exacerbating issue is that your graph does nothing to specify left-to-right sequence for the inner clusters. This is perfectly “legal”, but it leaves dot on its own (so to speak).

The version below adds invisible edges to provide guidance. It also rearranges the sequence of the clusters. Sometimes dot seems to sequence objects “backwards” - in the reverse order that they were specified, so they are now introduced “backwards”.
Please run your loop test. Maybe it will be deterministic.

digraph {  
  layout=dot ranksep=1 nodesep=1
  subgraph cluster_root {
    // note that the clusters are now in reverse order
    subgraph cluster_p { label="p" p_1 p_2 p_3  }
    subgraph cluster_v { label="v"  v_1 v_2 v_3  }
    subgraph cluster_c { label="c"  c_1 c_2 c_3  }	
  }
  c_1 -> c_2 -> c_3 
  v_1 -> v_2 -> v_3 
  p_1 -> p_2 -> p_3 
  c_1 -> v_2 -> p_3 [style=invis]  // try to establish left-right order

  edge[constraint=false]
  c_1 -> v_1 -> p_1
  c_2 -> v_2 -> p_2
  c_3 -> v_3 -> p_3

  edge[constraint=true]
  v_3 -> m_1
}

I appreciated the comments here were kind not harsh.

I wonder if, in situations where we are carelessly treating pointers as numbers, if we can instead use the graph object ID or sequence number, to make the behavior deterministic.

Stephen

@smattr, thanks for all this information, even if I did not understand everything :slight_smile:
At least it will give good orientation for those interested in understanding what is going under the hood.

@steveroush, thanks for your proposition, I tried it but unfortunately, I still have non-deterministic behavior. Few calls to the server with the same file show the problem. I believe that adding parents to constraint the graph as I showed in my previous post is ( so far ) the only way to have deterministic behavior

and another time…

Kasra

Yes, I was not meaning to crticize the previous design choices. Most of this was common practice at the time, as far as I have seen in C code bases elsewhere. Unfortunately changes in the surrounding ecosystem have shifted the ground beneath Graphviz.

This is more or less what I did in the original fix. Unfortunately this made fdp very unhappy. It introduces a number of “virtual” objects and then later expects to be able to look them up by pointer address. Unorthodox stuff in fdp is also what’s blocking fixing HTML-like and non-HTML-like strings with the same content cannot coexist (#2089) · Issues · graphviz / graphviz · GitLab IIRC. We reverted the original fix as it wasn’t obvious how to adjust fdp. Since then I have still not had time to debug either problem.

This graph produced the same result 19 times in a row on Windows. It adds two invisible nodes and edges to nail the clusters in the “right” positions. But no guarantees.
[yes, there were many failed ideas]

digraph {  
  layout=dot
  ranksep=1 nodesep=1
     // subgraph cluster_M{ // new cluster not needed ??
     //  added invisible nodes to help establish desired cluster positions
     {rank=same
       m_1
       node [shape=point style=invis label=""] inviz_c  inviz_p
       edge [style=invis]
       inviz_c -> m_1 ->  inviz_p
     }
     // }  // new cluster not needed ???
  subgraph cluster_root {
    subgraph cluster_c { label="c"  c_1 c_2 c_3  }	
    subgraph cluster_v { label="v"  v_1 v_2 v_3  }
    subgraph cluster_p { label="p"  p_1 p_2 p_3  }
    //c_1 -> v_2 -> p_3 [style=invis]  // try to establish left-right order
  }
  c_1 -> c_2 -> c_3 
  v_1 -> v_2 -> v_3 
  p_1 -> p_2 -> p_3 

  edge[constraint=false]
  c_1 -> v_1 -> p_1
  c_2 -> v_2 -> p_2
  c_3 -> v_3 -> p_3

  edge[constraint=true]
  v_3 -> m_1
  //  added invisible edges to establish desired cluster positions
  edge [style=invis]
  c_3 -> inviz_c
  p_3 -> inviz_p
}

Giving:

Great !

I like your solution, we still have to create invisible nodes but at least the cost is low.
By the way, I beat you! I’ve got much more than 19 times in a row with the same results :wink:

Thanks a lot.