In the SVG output format, how does Graphviz ensure that the text content fits inside the <polygon> element?
I assume that each English character occupies roughly one fontsize unit in width and height, but there will always be cases where the text extends beyond the interior of the rectangular container. So I’m curious how Graphviz implements this.
Is there a paper or other reference on measuring the length of text in SVG?
The approach I can think of is to tabulate the width of every character in every font, look up each character of the text in that table, and then sum the widths to get the total length.
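To make the table idea concrete, here is a minimal sketch in Python. The table `WIDTHS_EM`, the fallback `DEFAULT_WIDTH_EM`, and the function `estimate_text_width` are all made up for illustration, and the widths are not taken from any real font.

```python
# Hypothetical per-character widths, as fractions of the font size (1.0 = one em).
# Illustrative numbers only; a real table would come from a font's metrics.
WIDTHS_EM = {"i": 0.25, "l": 0.25, "m": 0.85, "w": 0.75, " ": 0.30}
DEFAULT_WIDTH_EM = 0.55  # assumed width for characters missing from the table


def estimate_text_width(text: str, font_size: float) -> float:
    """Sum the per-character table widths and scale by the font size."""
    return sum(WIDTHS_EM.get(ch, DEFAULT_WIDTH_EM) for ch in text) * font_size


if __name__ == "__main__":
    # Narrow and wide letters give very different totals at the same font size.
    print(estimate_text_width("ill", 14.0))
    print(estimate_text_width("mmm", 14.0))
```

Even in this toy version, a string of narrow letters comes out much shorter than a string of wide ones, so assuming one fontsize unit per character would not size the box correctly.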
The Graphviz program loads a font file from disk and uses the “font metrics” inside that file, which describe the size of each character in the font.
Sometimes this runs into trouble: Graphviz might run on a computer that has the font, while another computer displays the SVG without the font and has to fall back to a different one. Then the originally computed text size is wrong, and the SVG rectangle is now the wrong size.
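A rough sketch of that failure mode, in Python. The font names and average widths in `AVG_CHAR_WIDTH_EM` are hypothetical, and using one average width per font is a simplification, not how Graphviz measures text.

```python
# Hypothetical average character widths per font, as fractions of the font size.
AVG_CHAR_WIDTH_EM = {"LayoutFont": 0.50, "FallbackFont": 0.60}


def text_width(text: str, font: str, font_size: float) -> float:
    """Crude width estimate: character count times the font's average width."""
    return len(text) * AVG_CHAR_WIDTH_EM[font] * font_size


def overflows(text: str, layout_font: str, display_font: str,
              font_size: float, margin: float = 0.0) -> bool:
    """True if text drawn in display_font is wider than the box sized with layout_font."""
    box_width = text_width(text, layout_font, font_size) + 2 * margin
    return text_width(text, display_font, font_size) > box_width


if __name__ == "__main__":
    # The box was sized with LayoutFont but the viewer substitutes FallbackFont.
    print(overflows("hello world", "LayoutFont", "FallbackFont", 14.0))
```

The box is computed once at layout time, so any difference between the layout font and the font actually used for display shows up as text that spills out of (or floats loosely inside) the rectangle.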
Graphviz calls a driver to get the size of rendered text. (It’s a little more than just adding up the sizes of glyphs: for example, there may be kerning or ligatures to take into account.) Usually the formatter is the same as the renderer, but it doesn’t have to be, as explained in Command Line | Graphviz.
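As a toy illustration of why summing glyph sizes is not enough, here is a Python sketch that applies pair kerning on top of per-glyph advances. The tables `ADVANCE` and `KERNING`, the value of `UNITS_PER_EM`, and the function `measure` are assumptions made up for this example. This is not the code path any Graphviz driver uses, and ligatures are ignored.

```python
UNITS_PER_EM = 1000     # assumed design grid of the hypothetical font
DEFAULT_ADVANCE = 600   # assumed advance for glyphs missing from the table

# Hypothetical advance widths and kerning adjustments, in font design units.
ADVANCE = {"A": 650, "V": 650, "T": 600, "o": 550}
KERNING = {("A", "V"): -80, ("T", "o"): -60}


def measure(text: str, font_size: float) -> float:
    """Width of text: glyph advances plus pair kerning, scaled to the font size."""
    units = sum(ADVANCE.get(ch, DEFAULT_ADVANCE) for ch in text)
    # Each adjacent pair may tighten or loosen the spacing.
    units += sum(KERNING.get(pair, 0) for pair in zip(text, text[1:]))
    return units * font_size / UNITS_PER_EM


if __name__ == "__main__":
    # Same two glyphs, different order, different width because only "AV" is kerned.
    print(measure("AV", 10.0), measure("VA", 10.0))
```

The same glyphs in a different order can produce a different width, which a plain per-character lookup table cannot capture.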
The cairo SVG generator is interesting in that it embeds the fonts in the generated SVG, to get around the problem Mark mentioned, where there are different font files on the computer (or environment) where Graphviz was run to make the SVG, and on the computer where the SVG is being displayed. It works well but at the cost of increasing the size of the output SVG and making it more difficult to postprocess (if that matters to anyone but usually it doesn’t).
I think the only way to make sure you know absolutely what you are getting is to render to a raster format like PNG.
In at least one of the core drivers (like psgen) there used to be generated code that would compensate for some of the minor differences in text sizing, but we may have given up on that because the semantics are complicated and not supported by most other drivers anyway.
I’m not sure whether it’s surprising that 20+ years after we started, this aspect of operating system APIs hasn’t improved much. Probably there is little incentive to do more about it. Once in a while (as recently as last week) I still see PDFs of scientific papers that render wrong. It’s nothing to do with Graphviz. I think sometimes it’s a stale font cache or a failure to refresh one, in my case on the macOS system I’m using.
Thanks a lot, I didn’t know much about font metrics before. After your introduction and reviewing related materials, I have a general understanding of how Graphviz handles the relationship between “font family” and “text size”.
My guess is that enumerating all characters for all fonts is not doable in the general case, and would be quite challenging even in a bounded universe with limited users (though maybe a fun challenge).
Not sure why it’s not in the website. git log --summary | grep fontfaq shows no results for it ever being in the current website’s git repo (est. 2017).