Debugging constraint graphs passed to network simplex

I would like to fix remaining “trouble in init_rank” errors. By enabling some DEBUG code, I can see that dotgen/position.c inadvertently can create cycles. One recent example involves non constraint edges and clusters and the cycle involves at least constraint edge created by “keepout_othernodes”. One could speculate this is perhaps related to an edge label virtual node of a nonconstraint edge, possibly one that crosses a cluster boundary. One could maybe just try to code around that and see if it makes a difference. On the other hand, it would be generally helpful if, in debug mode (maybe enabled at runtime not compile time) all virtual nodes and even virtual edges could be annotated with text showing where they were created and some of the relevant data.

With whatever we’re doing now with safe strings in C, would this be easy?

I am very much in favor of this. It would be great to have always-compiled-in debug support, so users could more easily debug their issues and/or report them more accurately.

I am not quite sure what you mean by “whatever we’re doing now with safe strings” but I assume this refers to some of the changes I have made. I don’t think any of these changes have affected how strings are represented or managed. Could you clarify what you’re referring to here?

I would like to optionally create virtual node and edge names that show where they were created and the value of some variables when that took place. Is that still done with sprintf?

Oh, I see what you mean. Well, good news, the sprintf changes have not made anything impossible.

tl;dr: sprintf(foo, ...)snprintf(foo, sizeof(foo), ...

The longer version… Stephen, you’re no doubt aware of all the below, but I thought I would attempt a blow-by-blow explanation for the peanut gallery.

Graphviz (and other C code bases) contain code like:

char some_buffer[1234];
sprintf(some_buffer, "...", user supplied parameters...);

By passing in unexpectedly large parameters, it is often possible to overflow the bounds of some_buffer. This kind of buffer overflow corrupts the stack. The effect of ths is typically modifying other local variables or, more problematically, altering where control flow goes when you return from the containing function.¹

Over the years, this has proven to be a ripe source of safety and security vulnerabilities (just search the MITRE CVE database for such things). The main thing a miscreant can do with such a vulnerability is so-called Return-oriented programming. While Graphviz is notionally not a safety- or security-sensitive program, we have no oversight of the uses people apply it to.

This problem is particularly thorny for code bases like Graphviz that lived through ecosystem transitions like 32-bit → 64-bit:²

extern unsigned long x;
char some_buffer[11];
// safe on 32-bit x86, potential buffer overflow on x86-64:
sprintf(some_buffer, "%lu", x);

So, with the above in mind, I have been trying to move us to a safer idiom using snprintf:

snprintf(some_buffer, sizeof(some_buffer), "...", user supplied parameters...);

Sure, it probably won’t function correctly if you exceed the bounds of some_buffer, but you’ll at least get predictable, memory-safe behavior.

¹ The latter is possible because on architectures like, e.g., x86, the return address of a function call is stored on the stack. Hardware defenses like Intel CET and Clang CFI are capable of mitigating, and sometimes preventing, these issues, but it’s best to just not have them in your code in the first place.
² “But why didn’t you just…” I’m not even going to answer this implied question, but just refer people to

1 Like

Hilarious article, albeit hard to explain to people around me why I was laughing so much while reading it. Thanks for sharing.


It will take a little while to fully absorb this article and give it the consideration it richly deserves but he seems like a kindred spirit.

1 Like