Graphviz Design/Architecture

Do we have a design/architecture guide for graphviz? Assuming I would want to fix issues that I identified, how would I know where to look for the functionality to identify the right “package”? How would I learn about key concepts for each package?

There are a few Markdown files in the Git repository, and some directories have files with large comments at the top of them. I’m not aware of anything more overarching than that, sorry. The usual way I try to fix bugs is to file them in the GitLab issue tracker, then read the code until I figure out what’s going on, possibly asking for help on the issue tracker about unclear bits.

Mark is right. Maybe we can help with a specific example?

I was thinking of a high-level explanation. I am a newbie to Graphviz: how do I start? What are the plugins used for (I have a vague idea from the Graphviz as a library document)? Which packages are used, and what are they used for (like the GTS package mentioned in another thread)? I think a component-level description of the overall high-level architecture would be really good.

A practical example:
If I wanted to fix an error like:
How would I start? Which component should I look at? Which components play a role in this?

Another practical example: How does Graphviz handle fonts?

This is very incomplete, but here are some things that surprised me when ramping up on the Graphviz code base:

  1. There are three different build systems: Autotools, CMake, and MSBuild. Autotools is the original build system (driven by and the files). MSBuild exists to support Windows. CMake was introduced as an attempt to modernize things and unify the build; however, it is still very incomplete and doesn’t build some libs/tools. Right now we’re in a liminal state where we need to keep all three working. Most of the maintainers have no access to Windows, so this is an interesting exercise.
  2. The test suite is incomplete and spread across rtest/, rtest2/, and tests/. It would be nice to unify this and re-enable all the disabled tests. Test coverage is very low, so don’t rely on the test suite catching breakage in your changes to existing code. I try not to commit a fix to any issue without an accompanying test case in rtest/ to prevent future regressions.
  3. The build sprays warnings, many of them serious looking. We’ve been trying to incrementally pay this down but it’s a long slog. Magnus set up a nice “metrics” feature that tells you on a Merge Request if your changes affect the number of compiler warnings.
  4. Graphviz has known memory safety issues; search the issue tracker for segfault or things filed by Google Autofuzz. Generally I prioritize issues that are affecting human users above these, but we should look at addressing these too in the long term. A side effect of this is that Valgrind/ASan are less effective for development than they could be because of the number of existing violations.
  5. The code base is currently stuck pre-ISO-standardization of C and C++ because we support CentOS 6. After the impending 2.46.0 release, we can move to C99 and C++11, but not easily anything newer.
  6. Related to (5), there’s a lot of micro-optimization and quirky tricks in the code base that are no longer relevant. AIUI, they came from a time when machines were much more constrained. These days, many of these idioms are considered anti-patterns and I’ve been trying to cautiously remove them.
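To make the regression-test habit from (2) concrete, here is a minimal sketch of what such a check could look like. It is not how the existing rtest/ suite is organised; the function names, the reproducer graph, and the choice of `-Tsvg` are assumptions for illustration, and the test simply returns early when `dot` is not installed.

```python
import shutil
import subprocess


def run_layout(dot_source, command=("dot", "-Tsvg")):
    """Feed DOT source to a layout command on stdin and capture its output."""
    return subprocess.run(
        list(command),
        input=dot_source,
        capture_output=True,
        text=True,
        timeout=60,
    )


def test_issue_does_not_crash():
    """Hypothetical regression test: the reproducer must lay out cleanly."""
    # Placeholder graph; a real test would use the input from the bug report.
    source = "digraph { a -> b; }"
    if shutil.which("dot") is None:
        return  # Graphviz is not installed, so there is nothing to check
    result = run_layout(source)
    assert result.returncode == 0, result.stderr
    assert "<svg" in result.stdout
```

Given how low the coverage is, even a "does not crash and produces output" check like this catches a useful class of regressions.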


Thanks a lot Matthew! I didn’t realise that Graphviz uses CMake, but then I didn’t know CMake either. However, I started looking into CMake in combination with Xcode.

The above is a good starting point in many ways, even if it isn’t the overarching architecture. Right now, though, my more immediate goal is to get Graphviz compiled as a fat binary for arm64 and x86_64. You will see my changes to CMake soon.

I currently wonder why we split the common and pack builds into a variant with _obj (sharing a single source file base) and one without _obj (static library). Is this one of the anti-patterns mentioned? I would love to just remove this from the code.

This I don’t know, though it has certainly caused problems and confusion in the past (e.g. #1613). The Git history may have a more informative answer to this question.

I am not sure whether this is the right thread to ask this, as I am new to Graphviz. I am working on representing graphs visually in Elm (a functional programming language), but the packages there are not as good as Graphviz. I have used Graphviz with Elm via JS transpilation, but I was wondering whether it is possible to implement Graphviz natively in Elm. If so, how should I get started with the implementation? The implementation might contain only the positioning algorithm. Thanks to the developers for building such a nice graphing library.

Hi arc,

I think this is indeed the right thread. But I am not sure that there is a lot of documentation for what you have in mind. I fear you will need to reverse-engineer huge parts of the algorithms.

I believe that the Graphviz as a library document would be a good starting point to understand the overall structure of Graphviz. It also tells you on page 4 where you can learn about the key concepts of each of the algorithms. And it provides an understanding of cgraph, the underlying graph implementation.

I believe that you are mostly interested in the layout part (section 3). Assuming you are interested in the dot layout: most of the relevant code will be in the dotgen package.

I see that Elm doesn’t allow C/C++ interoperability. Did you consider creating a Graphviz server with a network interface? Basically, you have attributed nodes, edges and graphs (incl. clusters). You send them to the server (as JSON?) and get the laid-out positions, sizes and even edges (as splines) back. You can then draw the result yourself. This way you will save yourself a lot of work and might be able to work directly with the Graphviz codebase towards mutual benefit.
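To sketch what the conversion steps inside such a server could look like (leaving the network transport out entirely): the snippet below turns a JSON-style description into DOT source and parses the text that `dot -Tplain` prints back into a JSON-friendly structure. The JSON shape and the function names `to_dot` and `parse_plain` are my own assumptions, and the parser handles only the basic `graph`, `node`, `edge` and `stop` lines of the plain format, so treat it as a starting point rather than a complete implementation.

```python
import shlex


def quote(name):
    """Wrap an identifier in double quotes for DOT (only '"' is escaped)."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(graph):
    """Build DOT source from {"nodes": [names], "edges": [[tail, head], ...]}."""
    lines = ["digraph G {"]
    for name in graph["nodes"]:
        lines.append(f"  {quote(name)};")
    for tail, head in graph["edges"]:
        lines.append(f"  {quote(tail)} -> {quote(head)};")
    lines.append("}")
    return "\n".join(lines)


def parse_plain(text):
    """Parse `dot -Tplain` output into positions, sizes and spline points."""
    layout = {"nodes": {}, "edges": []}
    for line in text.splitlines():
        fields = shlex.split(line)  # copes with quoted names such as "b c"
        if not fields:
            continue
        kind = fields[0]
        if kind == "graph":
            # graph scale width height
            scale, width, height = (float(v) for v in fields[1:4])
            layout.update(scale=scale, width=width, height=height)
        elif kind == "node":
            # node name x y width height label style shape color fillcolor
            x, y, w, h = (float(v) for v in fields[2:6])
            layout["nodes"][fields[1]] = {"x": x, "y": y, "width": w, "height": h}
        elif kind == "edge":
            # edge tail head n x1 y1 .. xn yn [label xl yl] style color
            n = int(fields[3])
            coords = [float(v) for v in fields[4:4 + 2 * n]]
            points = [list(p) for p in zip(coords[0::2], coords[1::2])]
            layout["edges"].append(
                {"tail": fields[1], "head": fields[2], "points": points}
            )
        elif kind == "stop":
            break
    return layout
```

On the server side you would pipe the result of `to_dot(...)` into `dot -Tplain`, run `parse_plain` on what comes back, and send the resulting dictionary to the Elm client as JSON. The edge points are the spline control points, so the client can draw the curves itself.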

Good luck!

This was deliberately introduced in and discussed in detail in