Requesting new Docker images with ghostscript

Ah, thanks, I think I see. The first time I tried to build Graphviz and encountered both CMake and autotools, I wasn’t sure which to run and just picked one at random. This explains some of my confusion. Do you think we should standardize on a single build system purely to minimize maintainer workload? And if so, is CMake the only sensible cross-platform choice? I’m not suggesting doing this in the short term, but rather towards the end of the year or next year. @Ellson, please slap me down if this is a non-starter.

Now the ps2pdf issue… we’ve basically switched from a scenario that was developer-unfriendly (PDFs would silently fail to build) to one that is user-unfriendly (a missing ps2pdf fails the entire build). IMHO the ideal state would be for ./configure to detect ps2pdf and disable these targets if it’s not found. That way users get their passing build and developers can notice (or depend on) the missing PDF outputs to observe the failure. This doesn’t seem controversial, but please let me know if you disagree.

How we get there is another matter. I’m not familiar enough with autotools to achieve the state I’ve described. Anyone else?
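
My naive guess, which someone who actually knows autotools should sanity-check, is an AC_CHECK_PROG in configure.ac plus an automake conditional around the PDF targets. A rough, untested sketch, with all names purely illustrative:

# configure.ac
AC_CHECK_PROG([PS2PDF], [ps2pdf], [yes], [no])
AM_CONDITIONAL([HAVE_PS2PDF], [test "$PS2PDF" = "yes"])
AS_IF([test "$PS2PDF" = "no"],
      [AC_MSG_WARN([ps2pdf not found; PDF documentation will be skipped])])

# doc/Makefile.am (whatever the real PDF targets are)
if HAVE_PS2PDF
PDF_DOCS = graphviz.pdf
else
PDF_DOCS =
endif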

I don’t know much about build systems, but at my former employer, the C++ guys were strongly in favor of CMake and we were in the process of switching to CMake from an old in-house build system based on (I think) autotools. So my two (wait, just make it one) cents goes to CMake.

I agree regarding the ps2pdf issue. How we get there, I don’t know.

So who is the end-user that we should be optimizing for?

When I started on graphviz I would have described the end-user as someone with developer skills, probably on a Unix system … Sun, HP, Amdahl … someone who was used to installing software with
./configure; make; make install
and who expects ./configure to figure out maximum functionality on their system.

Today I don’t think that is our end-user. Today the end-user does a:
yum install graphviz
and chooses functionality by which sub-packages they install. Dependencies are handled automatically.

So we now have an intermediate stage (user?) which is a maximally equipped environment for a specific distro, i.e. our gitlab-runner build hosts (docker images).

I believe that we want our CI builds to break hard and early if they do not produce all features for the packages of the target system.

I think of the “./configure;make;make install” user as more of a developer today.
We can still debate whether ./configure should let ps2pdf soft-fail for this developer user … but I don’t think this user is our primary target any more. They are still important, but they can be expected to deal with stricter environment requirements.

re: cmake – can somebody talk about how complete they are for Unix builds?

Did you mean how complete CMake as a build system is? Or how complete Graphviz’s CMake files are?

The answer to the former is, I think, pretty comprehensive. CMake runs on at least Linux, macOS, Windows and FreeBSD. Are there other operating systems we need to support?

The answer to the latter is, I think, not very complete. As @scnorth has discovered, the CMake build does not seem to work on macOS. There’s nothing inherent in CMake that should cause this, so I think there must be some incorrect logic in our CMakeLists.txts. When I had a brief glance through them, I noticed a lot of unorthodox things so I suspect it would take a significant amount of effort to get this to parity with the autotools setup.

The alternative path I see is to delete the CMake support, leaving autotools as the one true way to build on *nix/macOS and MSBuild as the one true way on Windows.

I pushed new Docker images yesterday which are a union of master and two not-yet-merged Dockerfile changes: “Add rtest” (!1395) and “Change to install libedit-devel instead of symlinking” (!1400).

If anyone has an idea on how to better work with branched Dockerfiles and a linear Docker registry, please let me know.

I have some vague ideas. Mainly that the docker registry doesn’t have to be linear if we can use immutable SHA tags, and cut our dependency on always using the mutable tags like :ubuntu-20.04.

  • I think we should build the docker images on CI and tag them with their SHA hash
  • If we did this today, graphviz/graphviz (fedora33) would collide with graphviz/graphviz (ubuntu 20.04). So I think we might want to move the docker path to include the distro name, so that the tag is free to use as the SHA hash.
  • Alternatively we could append the SHA and the distro name to the tag (graphviz/graphviz:fedora33_1324324abcdef). Both options are sketched right after this list.
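
Concretely, the two naming schemes might look something like this (the Dockerfile name and image paths here are illustrative, not what we currently use):

# distro in the image path, SHA as the tag
docker build -f Dockerfile.fedora33 -t $CI_REGISTRY_IMAGE/fedora33:$CI_COMMIT_SHA .

# distro and SHA combined in the tag
docker build -f Dockerfile.fedora33 -t $CI_REGISTRY_IMAGE:fedora33_$CI_COMMIT_SHA .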

Once we did this, we could potentially stop using the linear tags (:fedora33, :ubuntu-20.04) and instead refer directly to the commit hash whenever we use the image? We could still use “:latest” as a cache, but always push to a SHA hash tag which would override latest. The cache hit rate would be pretty high.

It’s a bit frustrating that you seemingly have to think pretty hard about how to use docker tags in a way that makes sense with CI. I’m certain there’s some prior art here we could steal, e.g. https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#using-docker-caching shows building and pushing to a commit-hash tag and :latest:

docker build --cache-from $CI_REGISTRY_IMAGE:latest --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA --tag $CI_REGISTRY_IMAGE:latest .
docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
docker push $CI_REGISTRY_IMAGE:latest

Then we could run the tests based on the image $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA.

WDYT?

LGTM, but I’ll get back when I’ve had time to contemplate (which I don’t have today).

Perhaps there’s some way to pull images using the digest of the image too, rather than a tag. That’s probably the approach taken for pinning by the golang images (https://hub.docker.com/_/golang?tab=tags&page=1) and python images (https://hub.docker.com/_/python), which don’t use SHA tags.
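
For example, something along these lines, where the digest value is only a placeholder:

docker pull python@sha256:<digest-of-the-image>

A digest pins one exact image regardless of how its tags later move, so it would give us the same immutability as SHA-named tags without having to invent a tag scheme.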

Thanks, Mark. I think this is worth investigating at some point later on.

We never really coalesced around an answer on what to do about CMake vs autotools. AFAICT the CMake setup is very incomplete, with no targets implemented for some of the artifacts we care about. I see two reasonable approaches:

  1. declare the CMake build unsupported and delete it
  2. declare the CMake build the way forward and work towards parity with the autotools build, followed by deleting the autotools build

To reiterate my opinion from before, I don’t believe we have the resources to maintain these two build systems in parallel with equivalent functionality.

CMake is the more modern and cross-platform of these options. It would likely let us unify the *nix and Windows builds, thus maintaining less infrastructure in future. Having said that, my impression is that there is a lot of work to do to get the CMake build on par with the autotools one.

Opinions? Outrage?

I’m in favor of CMake. I’ve just successfully built for Windows 10 using CMake, although I’m not really sure if it’s “pure” since you have to specify which Visual Studio version you are targeting.
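
To illustrate what I mean about specifying the Visual Studio version, the configure-and-build steps look roughly like this (the exact generator string depends on which Visual Studio is installed, so yours may differ):

cmake -G "Visual Studio 16 2019" -A x64 ..
cmake --build . --config Release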

Do we have any log files from attempts to build with CMake under Linux? Otherwise I will make an attempt myself. I would be interested in helping to get it running, although I can’t contribute any knowledge (yet).

The autoconf/automake mechanism is what is used currently for all Redhat rpm builds, and for the traditional, and as we have discussed, more tolerant, ./configure;make;make install.

If you want to work on a CMake alternative, that’s fine, but it has to coexist with the autoconf… version until such time as it replaces all the capabilities and has been thoroughly tested for all use-cases.

John

Oh yes, and before you start replacing the unix tooling, could you please solve the deficiencies in the Windows build:
- Doesn’t use a single point of version numbering across all platforms
- Doesn’t use a CD process that is compatible across all platforms
so that we can have a single point of distribution for our binary images.

And then I have some questions based on my ignorance:
- Do the Windows builds use libtool? Or do they support dynamic loading of plugins by some other means? Do they even support plugins?
- In the CMake world, is there something equivalent to the ./configure;make;make install use-case that is tolerant of missing dependencies and supportive of more unusual use cases? For example, what would be the equivalent of ./configure --enable-static?

I’ll answer what I can.

In the porting to GitLab runners that I’m currently working on, we will achieve the latter for sure. The former will at least get rid of the version in appveyor.yml. I’ll see if we can also automate the one in CMakeLists.txt.
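
Regarding the --enable-static question: I believe the conventional CMake counterpart is to configure with shared libraries turned off, roughly as below, though I have not verified that our CMakeLists.txt honors the standard BUILD_SHARED_LIBS switch:

cmake -DBUILD_SHARED_LIBS=OFF ..
cmake --build .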

Magnus,

Would I be correct in assuming that for the new Windows runner you will be using the first stage pipeline as is, and then using the artifacts from the first stage for the Windows build using CMake?

If so, then please note that the first stage pipeline uses autogen.sh (which uses autoconf, automake), although I’m fairly sure this can be independent of the use of CMake in the second stage pipeline.

If you are really going all out on CMake, then I suggest at least deferring the first stage pipeline to a later development phase.

My plan is to do as little as possible initially to replace the Appveyor builds without losing any functionality, and then make further improvements. Some improvements (such as those we discussed above) will happen automatically or because they are simpler than the alternative. Appveyor today builds from the git repo, not from the portable source, and this will not change initially. Once we have build and deploy through GitLab CI/CD, have closed down Appveyor and updated the documentation, I will look into improvements.

Ideas I have:

  • Create Windows Docker containers with updated dependencies and remove Erwin Janssen’s windows dependencies submodule. This will hopefully speed up the builds and be something to build on for the future.
  • Review which features and plugins are not included in the Windows builds and try to add them.

I guess it would be possible to also add the files needed for Windows builds to the portable source so that we can build from them, although I’m a bit unclear on why this is an advantage. Doesn’t almost everybody just clone a git repo and build nowadays?

“I guess it would be possible to also add the files needed for Windows builds to the portable source so that we can build from them, although I’m a bit unclear on why this is an advantage.”

No, I wasn’t suggesting that. We don’t currently have Windows users who build from the tar.gz and that doesn’t bother me too much.

What I’m suggesting is using artifacts from the first stage. That is, the complete fresh git clone that has been minimally configured to include single-origin VERSION information.

There should be no need for any of the second stages to go back to the GIT repo.

Submodules are something only the Windows build uses at this time, AFAIK. But submodules can also be cloned by the first stage and left as artifacts for the second.

The second stage can start from the clone prepared by the first stage, and so already has the “portable” information from the first stage in place (e.g. VERSION, and graphviz.tar.gz, though not necessarily for Windows).

“Doesn’t almost everybody just clone a git repo and build nowadays?”

Hopefully almost nobody. Just us primary developers and perhaps maintainers for official distributions.
Our goal, I feel, is that everyone (the majority of users) should use binary packages that we (or Redhat, Ubuntu, … ) have prepared for them.

Yes, everyone “can” have access to our GIT repo, but very few have the skills or need to use it directly.

Oh, I see. The git repo is, however, not an artifact. A new clone (from the same commit) is created for every job. Git operations are fairly cheap, so I see no point in optimizing them.

The artifacts from the portable source stage are VERSION, COLLECTION and graphviz-*.tar.gz. Initially those are ignored, but I agree that it makes sense to use the VERSION file as a later improvement.

Yes, the submodules are used on Windows only.

I agree. When I wrote “everybody”, I meant “everybody who builds”. Sorry for being unclear.

I have to admit to being a little unsure about what officially constitutes an artifact. Does it have to be explicitly listed?

In my limited experimentation on the Windows runner it seemed like the entire clone was present in stage 2, even though it was not all listed as artifacts in stage 1. I was thinking that perhaps GitLab had some kind of shared filesystem to minimize the cost of pipelines. If this is not official then presumably we could be more explicit about what artifacts are needed.

Yes

It is present, but as I said, it is created as a new clone for each job. Have a look at the top of the “Complete Raw” log. Here’s the latest CentOS 6 build log.

Sorry, I don’t understand what you mean.