What defines the public API of Graphviz libraries?

This might seem like an academic question, but it matters when we set the version of our releases, since we don’t want to surprise our users. You might say that the public headers themselves define the API, but that’s clearly not enough since they say very little about the semantics. Also, in that case they could by definition never be wrong.

I can think of several definitions:

  1. What’s documented. If so, which documentation? We have man pages, PDFs and the web pages.
  2. What’s de-facto been in the library for a very long time.
  3. What’s (apparently?) intended to be in the library or not.

The reason I’m asking is the new C++ API I’m working on. It’s going to be quite incomplete and will probably change over a number of releases. Is that a problem even if we do not document it until it’s complete? Or can we document the parts that are implemented, but mark them as unstable or experimental so that changes to them do not warrant major releases?

Any thoughts regarding this would be appreciated.

For versioning purposes, I tend to think of the API as headers and behavior. I.e. “does the user need to modify or recompile their code?”
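To make the "modify or recompile" distinction concrete, here is a minimal sketch. The `NodeInfo` struct and the `v1`/`v2` namespaces are made up for illustration and are not taken from the Graphviz headers. Adding a field keeps existing source code compiling, but changes the object layout, so binaries built against the old header would need a recompile.

```cpp
#include <cassert>

// Hypothetical struct as it might look in two successive releases.
namespace v1 {
struct NodeInfo {
  int id;
};
} // namespace v1

namespace v2 {
struct NodeInfo {
  int id;
  double weight; // field added in the later release
};
} // namespace v2

// User code that only touches `id` compiles unchanged against both versions,
// so no source modification is needed.
template <typename Info> int read_id(const Info &info) { return info.id; }

// The layout differs, though, so code compiled against v1 cannot safely be
// linked with a library built from v2: a recompile is needed.
static_assert(sizeof(v2::NodeInfo) > sizeof(v1::NodeInfo),
              "layout changed between releases");

inline void check_source_compatibility() {
  v1::NodeInfo before{7};
  v2::NodeInfo after{7, 0.5};
  assert(read_id(before) == 7);
  assert(read_id(after) == 7);
}
```

Under the definition above, a change like this would count as an API change even though no user has to edit their code.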

For the C++ API, yes, I think we should mark it experimental and not bump major versions for changes in it at first. Otherwise almost every single release will be a major version bump or we’ll feel pressured to stabilize the API too early.
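One possible way to mark it, as a rough sketch only: put the unstable parts in an `experimental` namespace so that callers have to spell out that they are opting in. The names `gvxx`, `GVXX_VERSION_MAJOR` and `Graph` are hypothetical and do not describe the actual C++ API being worked on.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical version macro: bumped only for breaking changes to stable API.
#define GVXX_VERSION_MAJOR 1

namespace gvxx {

// Stable part: a breaking change here would warrant a major release.
inline int stable_api_major() { return GVXX_VERSION_MAJOR; }

// Unstable part: anything in this namespace may change in any release
// without a major version bump. The namespace name is the warning.
namespace experimental {

class Graph {
public:
  explicit Graph(std::string name) : name_(std::move(name)) {}
  const std::string &name() const { return name_; }

private:
  std::string name_;
};

} // namespace experimental
} // namespace gvxx

inline void use_experimental_graph() {
  // The caller has to type "experimental", so the opt-in is visible in
  // their own source code.
  gvxx::experimental::Graph g("demo");
  assert(g.name() == "demo");
}
```

Once a class is considered stable, it could be moved (or aliased) into the parent namespace, and from then on it would fall under the normal versioning rules.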

For versioning purposes, I tend to think of the API as headers and behavior. I.e. “does the user need to modify or recompile their code?”

I agree, but my question is what defines the API, i.e. what defines what the user has a “right” to expect?

I have a few answers to this question, depending on how I feel from day to day :slight_smile:

Answer 1: It’s all the documented behaviour, and the header types.

Answer 2: at scale, every behaviour of the program is depended on by someone: https://www.hyrumslaw.com/

Thanks. Interesting reading. I think I’ve heard about it before sometime, but forgot.

I rest my case (for the time being anyway). :grin:

When Graphviz was written, API semantics were negotiated between Emden Gansner, John Ellson and me. We expected man pages, header files and the implementation to agree and took this seriously. Problems were considered case-by-case. Some dark corners, like the semantics of user-defined IDs in cgraph, were settled informally and probably not sufficiently documented. Later, Emden wrote API guides that probably have much more impact than man pages by now.

I liked the article about Hyrum’s law (I’m fond of well-written blogs about dev practices in OSS), but Graphviz is so insignificant compared to mainstream software like glib or networking libraries that it seems unlikely that many other people are familiar with undocumented features of the implementation.

See also xkcd: Workflow

:rofl: Yes, I saw that. He linked to it at the end of the page.

I think that example illustrates very well why it’s good to have a documented API. In that example, I’m pretty certain that the previous behavior was not part of a documented API, i.e. the change was most certainly a bug fix, not a removal of a previously officially supported feature.

The goal can’t be to never unexpectedly break anything for any user, regardless of how they misuse the implicit interfaces. The goal must be to not unexpectedly break applications that adhere to the documented API. Of course there will be gray zones and sometimes we might even make implicit behavior part of the public API if there’s good reason for it.
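A small sketch of the difference between adhering to a documented API and depending on implicit behavior. The function `node_names` and its documented contract are invented for this example and are not part of any Graphviz library.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical library function.
// Documented contract: returns all node names, in unspecified order.
// This implementation happens to return them sorted, which is not promised.
inline std::vector<std::string> node_names() { return {"a", "b", "c"}; }

// Adheres to the documented contract: a linear search works for any order.
inline bool has_node(const std::string &name) {
  const std::vector<std::string> names = node_names();
  return std::find(names.begin(), names.end(), name) != names.end();
}

// Depends on implicit behavior: std::binary_search requires a sorted range,
// so this breaks if a later release returns the names in another order.
inline bool has_node_fragile(const std::string &name) {
  const std::vector<std::string> names = node_names();
  return std::binary_search(names.begin(), names.end(), name);
}
```

Both callers work today. If the order ever changed, only `has_node_fragile` would break, and by the reasoning above that change would still not be a break of the public API.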