Apologies that this isn’t directly about graphviz, but rather about one of the third-party wrapper libraries: specifically, the Python pydot package.
Pydot includes a limited parser for graphviz syntax, as it supports parsing from `.dot`/`.gv` files into API graph definitions. It doesn’t parse the bits of the code that it doesn’t directly interpret (HTML-like strings, for instance, are just passed through untouched), but it does have the ability to interpret basic graph, node, edge, and subgraph syntax.
I’ve discovered a few odd rules in the parsing for port components of node IDs and edge endpoints. There are definitions for syntax that, as far as I can tell, aren’t valid graphviz statements at all. While pydot appears to support them, at least in its parser, actually writing code in that form just results in syntax errors from `dot`, `neato`, and any other engine I’ve tried to feed it to.
These rules are so old they date back to the very first commit in the pydot repo, so there’s no hope of tracking down any sort of explanation for them via commit history.
So, I’m wondering if anyone recognizes any of this syntax, knows if there’s anywhere it might possibly be valid, remembers if it was ever valid, or… well, really, can offer any other insights.
If not, and since the code is at odds with both the documented grammar and the observed behavior of graphviz, I’m probably going to rip it out of the parser on the grounds that accepting it would only be creating invalid graph definitions that would break when they were fed back into the graphviz tools.
Pydot’s `dot_parser.py` currently includes the following parser rules, all as components of the `node_id` rule that can appear either at the start of a node statement or on either side of an edge operation (`--` or `->`):
1. `port_angle`, which is defined as a literal `@` sign followed by a valid identifier.
2. `port_location`, which is defined as either:
   1. a possibly-repeating series of a literal `:` followed by an ID (correct, although too permissive)
      OR
   2. a literal `:`, followed by a literal `(`, then an ID, a literal `,`, another ID, and a literal `)`. So, `:(ID,ID)`
3. Those are assembled into a `port` definition that’s either:
   1. A `port_location` (`:ID:ID` or `:(ID,ID)`) followed by an optional `port_angle` (`@ID`)
      OR
   2. A `port_angle` followed by an optional `port_location`
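To make the shape of those rules concrete, here is a rough sketch of them as plain regular expressions. This is not pydot's actual parser code, just my reading of the rules described above; the names (`ID`, `PORT_ANGLE`, `PORT_LOCATION`, `PORT`, `matches_described_rules`) are made up for illustration, and the identifier pattern is deliberately simplified.

```python
import re

# Simplified identifier (an assumption for this sketch): real DOT IDs also
# include numerals, double-quoted strings, and HTML-like strings.
ID = r"[A-Za-z_][A-Za-z0-9_]*"

# Rule 1: a literal '@' followed by an identifier.
PORT_ANGLE = rf"@{ID}"

# Rule 2: either one or more ':ID' segments, or the ':(ID,ID)' form.
PORT_LOCATION = rf"(?:(?::{ID})+|:\({ID},{ID}\))"

# Rule 3: location with optional angle, or angle with optional location.
PORT = rf"(?:{PORT_LOCATION}(?:{PORT_ANGLE})?|{PORT_ANGLE}(?:{PORT_LOCATION})?)"

_PORT_RE = re.compile(PORT)


def matches_described_rules(port: str) -> bool:
    """Return True if `port` fits the rules as described in the list above."""
    return _PORT_RE.fullmatch(port) is not None


if __name__ == "__main__":
    for sample in (":p", ":p:n", ":a:b:c", ":(a,b)", ":p@x", "@x", "@x:(a,b)"):
        print(sample, matches_described_rules(sample))
```

Every sample in that loop is accepted by the sketch, including `:a:b:c`, which is the "too permissive" part of 2.1.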
Except for 2.1, I can’t find any sign that any of those are permitted by the graphviz grammar or existing parsers. If they are, it appears to be totally undocumented.
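For comparison, the port rule in the DOT language documentation is `port : ':' ID [ ':' compass_pt ] | ':' compass_pt`, with `compass_pt` being one of `n`, `ne`, `e`, `se`, `s`, `sw`, `w`, `nw`, `c`, or `_`. Below is a small sketch of that documented rule as a regular expression. The names (`ID`, `COMPASS`, `is_documented_port`) are hypothetical, the identifier pattern is simplified, and it only models the written grammar; the real graphviz parser may be more lenient in places.

```python
import re

# Simplified identifier (an assumption for this sketch): real DOT IDs also
# include numerals, double-quoted strings, and HTML-like strings.
ID = r"[A-Za-z_][A-Za-z0-9_]*"

# Compass points as listed in the DOT language documentation.
COMPASS = r"(?:ne|nw|se|sw|n|e|s|w|c|_)"

# port : ':' ID [ ':' compass_pt ] | ':' compass_pt
_DOCUMENTED_PORT_RE = re.compile(rf":{ID}(?::{COMPASS})?|:{COMPASS}")


def is_documented_port(port: str) -> bool:
    """Return True if `port` fits the documented DOT port rule (as modeled here)."""
    return _DOCUMENTED_PORT_RE.fullmatch(port) is not None


if __name__ == "__main__":
    for sample in (":p", ":p:ne", ":ne", ":a:b:c", ":(a,b)", ":p@x", "@x"):
        print(sample, is_documented_port(sample))
```

Under this model, `:p`, `:p:ne`, and `:ne` are accepted, while `:a:b:c`, `:(a,b)`, `:p@x`, and `@x` are all rejected, which lines up with the syntax errors I get from the actual tools.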
What gives? Is it syntax for some esoteric graphviz tool? Some other “graphviz-like” software that extended the syntax? Legacy syntax that used to be supported in graphviz, but was dropped so long ago there isn’t even any record of it in the gitlab repo’s history? Pydot’s parser just straight-up inventing its own, incompatible syntax?
…Any thoughts/guesses/wisdom?