Version numbering going forward?

I plan to create a much simpler single source of truth for version numbering, since the CMake and autogen.sh unification that I did is a terrible hack and doesn’t include MSBuild. Before starting this I would however like to have a discussion about how we should number versions going forward since the current scheme with even minor versions for stable and odd minor versions for development unnecessarily complicates the release process and has been abandoned by most projects. I wonder if it’s time for Graphviz to do that as well.

My proposal is to go all-in for Semantic Versioning 2.0.0, but I’m open to other suggestions as long as it’s something well documented.

This would mean that development versions would be named like the next intended stable release followed by a hyphen and a pre-release identifier according to this section. This identifier could be anything that makes sense, e.g.:

  • 2.46.1-dev.<committer-date>. This would be what is most similar to what we have today, but I think the precedence rules prohibit that, since if we allow pushes directly to master, the committer date may decrease (or may it not?). If we prohibit pushes directly to master, I think it would be possible since the committer date of merge commits would always increase. EDIT: After thinking harder about it, I now think this is ok.
  • 2.46.1-dev.<build-date>. Simple to generate
  • 2.46.1-dev.<pipeline-number> Simple in CI, but not available outside
  • 2.46.1-dev.<serial-number> Could probably be generated and stored in CI, but it would be impractical outside.

You can probably come up with more clever ideas.

We can use alpha, pre, build or anything else instead of dev if we like.

NOTE: We cannot use build metadata (version followed by plus sign and identifier) only since it “MUST be ignored when determining version precedence”, but we can use it in addition to the pre-release identifier if we like.
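To make the precedence rules concrete, here is a minimal sketch of a SemVer 2.0.0 comparison in Python. The names `parse` and `compare` are made up for this illustration, and the pattern is a simplification of the official grammar (it does not reject leading zeros or empty identifiers, for example). It shows that a `-dev.<number>` version sorts below the release it leads up to, that numeric identifiers compare numerically, and that build metadata does not affect the order.

```python
import re
from functools import cmp_to_key

# Simplified SemVer 2.0.0 pattern: MAJOR.MINOR.PATCH[-prerelease][+build]
_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"
    r"(?:-([0-9A-Za-z.-]+))?"
    r"(?:\+([0-9A-Za-z.-]+))?$"
)


def parse(version):
    """Split a version string into (core tuple, list of pre-release identifiers)."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    core = tuple(int(part) for part in match.group(1, 2, 3))
    prerelease = match.group(4).split(".") if match.group(4) else []
    # Build metadata (group 5) is dropped on purpose: it must be ignored
    # when determining precedence.
    return core, prerelease


def _compare_identifiers(a, b):
    """Compare two pre-release identifiers; returns -1, 0 or 1."""
    if a.isdigit() and b.isdigit():
        return (int(a) > int(b)) - (int(a) < int(b))
    if a.isdigit():
        return -1  # numeric identifiers rank below alphanumeric ones
    if b.isdigit():
        return 1
    return (a > b) - (a < b)  # plain ASCII order


def compare(left, right):
    """Return -1, 0 or 1 depending on the precedence of left versus right."""
    core_l, pre_l = parse(left)
    core_r, pre_r = parse(right)
    if core_l != core_r:
        return (core_l > core_r) - (core_l < core_r)
    if not pre_l or not pre_r:
        # A version without a pre-release part outranks one that has it.
        return bool(pre_r) - bool(pre_l)
    for a, b in zip(pre_l, pre_r):
        result = _compare_identifiers(a, b)
        if result:
            return result
    # All shared identifiers are equal: the longer list wins.
    return (len(pre_l) > len(pre_r)) - (len(pre_l) < len(pre_r))
```

A list of version strings can then be ordered with `sorted(versions, key=cmp_to_key(compare))`.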

What do you think?

Semantic versioning is OK with me. Though I’ve never understood how you indicate an ABI break in semver. E.g. if you rename a public header file, this is an API-breaking change but not ABI-breaking. I.e. pre-compiled software will continue to work fine with new SOs.

For development versioning, I vote for 2.46.1-dev.<commit-hash>. Obviously this is non-numerical, but you’ve said it is ignored for version precedence. Moreover I claim version precedence makes no sense for development versions anyway. If the commit graph contains a diamond, commits on different paths of the diamond are inherently incomparable. While there is a total order over release versions, there is only a partial order over development versions.

Sorry. I don’t understand. You ask how to indicate an ABI break and then give a non-ABI break as your example.

Non-numericality is no problem, but not being able to compare for precedence is for pre-release identifiers.

I didn’t say that for pre-release identifiers. I said it for build metadata.

That would be true if we released development versions from branches other than master or if we didn’t always build from the current HEAD of the master (canonical) repo, but since we don’t, even development versions will be strictly ordered.

Based on your commit-hash idea I invented another scheme:

  • 2.46.1-dev.<any-strictly-increasing-number>+<commit-hash>
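As a rough sketch of how such a string could be assembled, the Python below composes the three parts and, as one possible choice, takes the committer timestamp of HEAD as the increasing number. The function names `dev_version`, `git_output` and `dev_version_from_git` are hypothetical, and the timestamp only increases under the assumptions discussed in this thread (builds always from the tip of master, no force pushes).

```python
import subprocess


def dev_version(next_release, serial, commit_hash):
    """Compose '<next release>-dev.<serial>+<commit hash>'."""
    if serial < 0:
        raise ValueError("serial must be non-negative")
    # The serial is a pre-release identifier and takes part in precedence;
    # the commit hash is build metadata and is ignored for precedence.
    return f"{next_release}-dev.{serial}+{commit_hash}"


def git_output(*args):
    """Run a git command in the current directory and return its stripped output."""
    return subprocess.check_output(["git", *args], text=True).strip()


def dev_version_from_git(next_release):
    """Build a development version for the commit checked out in the current repo."""
    # Assumption: the committer timestamp (seconds since the epoch) of HEAD
    # is used as the increasing number.
    serial = int(git_output("log", "-1", "--format=%ct"))
    commit_hash = git_output("rev-parse", "--short", "HEAD")
    return dev_version(next_release, serial, commit_hash)
```

Calling `dev_version("2.46.1", 1600000000, "abc1234")` yields `2.46.1-dev.1600000000+abc1234`.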

Can we steal a scheme off some similar software?

Yes, sorry, I should have given a clearer example. Here is the missing context in my head:

  • scenario 1: you rename a header. API break, ABI-backwards-compatible.
  • scenario 2: you change the type of a parameter in a function prototype in a header. API break, ABI-break.¹
  • scenario 3: you refactor a function’s implementation, preserving semantics.

Semver would say (1) and (2) both constitute a major version bump and (3) constitutes a minor or patch version bump depending on whether you fixed a bug or not. Here are my concerns:

  1. Semver does not discriminate between (1) and (2).
  2. Semver assumes you have the ability to classify your changes as bug-fixing or not.

I will put aside my concerns for item 2,² but item 1 is something I am wondering about. On the other hand, maybe no one cares about ABI for something like this?

Either I don’t understand your point or I disagree with it. Consider the following commit graph:

* Z
|\
| * Y
| |
* | X
| |
|/
* W

I claim there is no way to objectively order X and Y. The time stamps are influenced by local machine clocks as well as whatever rebasing or other commit refactoring has been done. Moreover, I will be so bold as to claim this is correct. It is accurate that X and Y are incomparable linearly because each contains changes the other does not.
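A small Python sketch of that claim, using a hand-written parent table for the four commits in the graph above (the names `PARENTS`, `ancestors` and `relation` are made up for this illustration). Ancestry orders W before everything and Z after everything, but gives no answer for X versus Y.

```python
# Parent links for the diamond above: commit -> list of parents.
PARENTS = {
    "W": [],
    "X": ["W"],
    "Y": ["W"],
    "Z": ["X", "Y"],
}


def ancestors(commit):
    """Return every commit reachable from `commit` by following parent links."""
    seen = set()
    stack = list(PARENTS[commit])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def relation(a, b):
    """Describe how commit `a` relates to commit `b` by ancestry alone."""
    if a == b:
        return "same"
    if a in ancestors(b):
        return "older"
    if b in ancestors(a):
        return "newer"
    return "incomparable"
```

Here `relation("W", "Z")` is `"older"`, while `relation("X", "Y")` is `"incomparable"`: ancestry is only a partial order.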

Semver is the most established versioning scheme I’m aware of. Even if I don’t like it :wink:


¹ (2) implies an ABI-breaking change to me. You may change an int parameter to an unsigned int parameter, and everything continues to link and load fine. But in reality your software is broken due to the caller expecting semantics the callee does not provide.

² I basically subscribe to some variant of the Greg Kroah-Hartman school of thought on this one: in real world code bases, you will find numerous commits that fix bugs (either intentionally or inadvertently) but were not explicitly identified as bug fixes at the time. Making a decision on whether to upgrade or not based on the target version number is a game of self-delusion.

I’m not disputing using semver; it’s just that semver is a very broad standard (basically three numbers and a string) and we still need to figure out what numbers and strings to use. I’m hoping we can steal some other projects’ ideas about what numbers and strings to use (particularly the strings for dev versions).

True, but why are you concerned about it? I.e. why would it be important to discriminate? I would say that the ABI is also a form of API. At least for the purpose of version numbering.

No, it always constitutes a patch version bump (if you even make a release based on that change).

No, it assumes you have the ability to classify your changes as a change of the API or not. It never classifies changes as non-bug-fixing. Any release can contain a bug-fix.

True, but semver does not attempt to solve that problem. You seem to think that a minor version cannot contain a bug-fix. That’s incorrect.

I completely agree. My point is that we always make releases from the tip of the master branch of the canonical repo and if X is the tip at one point in time, Y cannot be the tip at another time because Git will not allow you to push one without first incorporating the other through a merge or rebase (unless you force-push of course, but I hope we can agree that this is out of the question). In your example it’s a merge and it creates the new tip Z which is strictly newer than both X and Y.

The two scenarios are:

  • A release is made when X is the tip. We then want to make a release containing Y. This release must be made from Z.
  • A release is made when Y is the tip. We then want to make a release containing X. This release must be made from Z.
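The first scenario can be played through in a toy model. The Python sketch below is not how Git is implemented; `Remote`, `push` and `is_ancestor` are names invented here, and the only rule modelled is that a push is accepted when the current tip is an ancestor of the pushed commit (a fast-forward), with force pushes assumed to be ruled out.

```python
# Parent links for the example graph: commit -> list of parents.
PARENTS = {"W": [], "X": ["W"], "Y": ["W"], "Z": ["X", "Y"]}


def is_ancestor(older, newer):
    """True if `older` is reachable from `newer` (or is the same commit)."""
    stack = [newer]
    seen = set()
    while stack:
        current = stack.pop()
        if current == older:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return False


class Remote:
    """Toy model of the canonical master branch: fast-forward pushes only."""

    def __init__(self, tip):
        self.tips = [tip]  # every commit that has ever been the tip, in order

    def push(self, commit):
        """Accept the push only if the current tip is an ancestor of `commit`."""
        if commit == self.tips[-1]:
            return True  # nothing to update
        if not is_ancestor(self.tips[-1], commit):
            return False  # rejected: would need a merge, a rebase or a force push
        self.tips.append(commit)
        return True
```

Starting from `Remote("W")`, pushing X succeeds, pushing Y is then rejected, and pushing the merge Z succeeds. The recorded tips are W, X, Z, a chain in which each tip descends from the previous one.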

I don’t have a use case myself, but for anyone shipping a binary that links against Graphviz libraries this is the difference between whether they need to recompile their application or not.

How so? Quoting the semver docs:

Given a version number MAJOR.MINOR.PATCH, increment the:

  1. MAJOR version when you make incompatible API changes,
  2. MINOR version when you add functionality in a backwards compatible manner, and
  3. PATCH version when you make backwards compatible bug fixes.

My interpretation of this is that changes that are not bug fixes bump the minor version.

I was thinking about development versions, not release versions. X and Y will have differing development version numbers. E.g. if I checkout X and compile it, I expect to get a different version number than if I did the same with Y.

Release version numbers, I agree, are always totally ordered.

If you are concerned with avoiding unnecessary recompilation, I grant you that you cannot do that. But it’s a problem that semver doesn’t even attempt to solve. What it solves is differentiation between the case when recompilation is guaranteed to be unnecessary and the case where it might be necessary and should be treated as if it actually was.

I agree that in isolation, that text can be interpreted like that. I still argue it’s wrong. My interpretation is (emphasized text is my additions):

  1. MAJOR version when you make incompatible API changes,
  2. MINOR version when you add functionality to the API in a backwards compatible manner, and
  3. PATCH version when you make backwards compatible bug fixes or other changes that do not affect the API
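Read this way, the bump depends only on what a change does to the public API. A minimal Python sketch of that reading follows; the function `bump` and the three change labels are invented for this illustration, and the optional minor bump for substantial internal improvements is left out.

```python
def bump(version, change):
    """Return the next release version under the interpretation above.

    `change` is one of:
      "breaking"      incompatible API change
      "api-addition"  backwards compatible addition to the API
      "other"         bug fix or any other change that leaves the API alone
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "api-addition":
        return f"{major}.{minor + 1}.0"
    if change == "other":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown kind of change: {change!r}")
```

Under this reading, a semantics-preserving refactoring released on top of 2.46.1 gives `bump("2.46.1", "other")`, i.e. 2.46.2, whether or not it happens to fix a bug, while renaming a public header gives `bump("2.46.1", "breaking")`, i.e. 3.0.0.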

You can rightfully ask how I can be so presumptuous (dishonest, unreasonable, irrational or just plain stupid) to alter the text just to make my case. Therefore I will provide evidence to support it.

TL;DR

  1. Minor version … MAY include patch level changes.

TS;WM

General comment about the quoted text

The text that you and I have been quoting above is sloppily formulated. He makes the assumption that the only reason for making a release that does not change the API is to fix a bug. This is of course wrong since e.g. a performance improvement could warrant a new release and it doesn’t fall into any of those categories. In his defense I can say that that text is just an introduction and not the formal specification. Your example does not fall into any of the categories either.

Detailed evidence

In the introductory text he makes clear that semver is all about communicating whether the new release contains changes to your public API or not and whether such changes are backwards compatible or not.

The introductory text says (my emphasis):

Once you identify your public API, you communicate changes to it with specific increments to your version number. Consider a version format of X.Y.Z (Major.Minor.Patch). Bug fixes not affecting the API increment the patch version, backwards compatible API additions/changes increment the minor version, and backwards incompatible API changes increment the major version.

Again, he makes the assumption that I mentioned above, but the priority is clear.

From the formal specification (my emphasis):

  1. Patch version Z (x.y.Z | x > 0) MUST be incremented if only backwards compatible bug fixes are introduced.

Again, the same incorrect assumption, but otherwise clear.

From the formal specification (my emphasis):

  1. Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated. It MAY be incremented if substantial new functionality or improvements are introduced within the private code. It MAY include patch level changes. Patch version MUST be reset to 0 when minor version is incremented.

This is the final, indisputable (in my mind at least), conclusive evidence that my interpretation is the correct one.

Since a bugfix is allowed in a minor release, the argument that a change that is NOT a bugfix MUST warrant a minor release is not logical.

I grant you that if “substantial new functionality or improvements are introduced within the private code” by this example, it COULD warrant a minor release, but not if it’s just a simple refactoring. A simple refactoring wouldn’t warrant a new release at all, but if you decide to do one anyway, it MUST be a patch release if it’s the only change.

No apologies for being blunt this time since you don’t want any. I grant you that the specification contradicts itself regarding bugfixes but if you read the whole document, I think that the intention is clear and in favor of my interpretation. If you still don’t agree, I will rest my case.

Maybe I’ll open a pull request with the changes I did above, but given the age of some of the currently open pull requests, I don’t know if it’s worth the effort. EDIT: I did open a pull request after all.

Ah, great. Then I understand why we don’t understand each other. I’m talking about the development snapshots that are built in the CI/CD pipeline and released here. Maybe I’m using the word release incorrectly here. Maybe I should just say deliver.

Development versions that are built manually are normally inherently untraceable since there is no way to determine how they were built in retrospect. Perhaps we should have different pre-release identifiers for CI/CD built development versions and versions that are built manually.

Ah, OK, I think I now follow (and agree with) your thinking about semver and versioning of development snapshots. Thanks for the detailed explanation.

The one thing I still have a problem with is:

While I said I don’t have a use case for this myself, this is not a theoretical problem. Graphviz is packaged in Debian and friends (albeit lagging the current version). Anyone can create a package that depends on Graphviz in order to link against a Graphviz library. If the Graphviz version numbers have no way of indicating an ABI break, such downstream dependents will have to assume every API-breaking change is also ABI breaking. And therefore they need to recompile their application and bump their version number.

Maybe this is acceptable. I have not asked anyone maintaining such a package. Having said that, moving to semver is an improvement over the status quo so I don’t think these concerns should hold us back. Just something to keep in mind going forwards.

:tada: :+1:

I acknowledge that.

Given that Graphviz is at major version 2 after ~30 years, I would consider this almost a theoretical problem :grin:. No offense; mostly meant as a joke

Yes, fair point. I had not really thought about this, but you’re right that it’s unlikely anyone is relying on these kinds of guarantees right now anyway.

@smattr Another twist on the API/ABI breaking changes:

Or maybe that was what you already had in mind?

Interesting. This was not quite the way I was thinking about it, but it does explain to me why semver does not explicitly deal with ABI.

So we can’t use dash (-) in dev versions because RPM doesn’t allow it. After reading

I settled for tilde (~) instead. Underscore would have been my second option.
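For illustration, here is one way the mapping from a SemVer-style development version to a tilde-separated one could look in Python. The function name `to_rpm_version` is made up, and two choices are assumptions rather than anything decided in this thread: build metadata is dropped, and a hyphen inside the pre-release part is rejected instead of being rewritten.

```python
def to_rpm_version(semver):
    """Rewrite '<core>-<prerelease>' as '<core>~<prerelease>'."""
    # Assumption: build metadata (everything after '+') is not wanted in
    # the package version, so it is dropped here.
    without_build, _, _ = semver.partition("+")
    core, _, prerelease = without_build.partition("-")
    if not prerelease:
        return core  # a plain release version needs no rewriting
    if "-" in prerelease:
        # SemVer allows hyphens inside identifiers; this sketch refuses
        # them rather than guessing at a replacement.
        raise ValueError(f"hyphen inside pre-release part: {prerelease!r}")
    return f"{core}~{prerelease}"
```

With this, `to_rpm_version("2.46.1-dev.20200901")` gives `2.46.1~dev.20200901`, and a release version such as `2.46.1` passes through unchanged.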

Comments?

Vincent Bernat’s reply in that thread makes an interesting case for using “+” instead: https://github.com/semver/semver/issues/145#issuecomment-109044744

though grv87’s later comment suggests this might cause problems: https://github.com/semver/semver/issues/145#issuecomment-385239489