Removal of GV_FILE_PATH environment variable

Hi, I am revisiting some old code that uses the GV_FILE_PATH environment variable to point to other image files to be included in the generated image. After some searching I found that GV_FILE_PATH has been removed in release 6.0.1 [1].
I cannot find a rationale, nor a description of an alternative. Can you help?

[1] remove 'GV_FILE_PATH' documentation (4782c403) · Commits · graphviz / graphviz.gitlab.io · GitLab

Thanks,
Bram van Oosterhout

Funny you should ask about this. I’ve been pondering how to reinstate something like this myself.

Some of the background is discussed in Suggestions to deal with removal of GV_FILE_PATH support (#2396) · Issues · graphviz / graphviz · GitLab. In summary, having Graphviz try to unreliably sandbox itself via an environment variable was a bit of a security landmine. My current thinking is to just add a command line parameter (--file-path or something similar) that has a similar effect.

This thing definitely has utility, though not for sandboxing. Much like they say a lawyer who represents themselves in court has a fool for a client, a program that tries to sandbox itself is no sandbox at all. If you need a Graphviz sandbox, you need something else.

Thank you.
That gives me something to think about.
-Gimagepath can work for me.

Thanks for the pointer and super fast response.

Thanks again for the response.
I did some experiments with -Gimagepath. And I hit another hurdle. I am using dot in a web environment and SERVER_NAME is set. That produces a warning: “file loading is disabled because the environment contains SERVER_NAME=”. [1]

I can work around that by removing SERVER_NAME before calling dot and recreating it following the call.

I understand the security rationale for restricting file loading in a server environment.
And I would be greatly helped if the restriction could be lifted through a command line option or configuration. I am thinking of:
-GnoImagePathInServer=false
or an option without a double negative:
-GdenyImagePathInServer=false

Would that be acceptable as an approach? Or are there other considerations?

[1] lib/common/utils.c · main · graphviz / graphviz · GitLab, lines 307-315 (safefile)

Kind regards,
Bram

I think I’m not understanding your use case. Why would you need to recreate SERVER_NAME?

If you just want to regain the ability to load external files, prefixing your dot invocation with env --unset=SERVER_NAME should do it.

My apologies. I noticed my response email bounced.

Hi there,
Thanks for following up.

I am using dot inside a wiki [1]. I am not very familiar with the inner workings of the web server. Here is what I think happens:

The web server sets SERVER_NAME.
dot detects SERVER_NAME is set and decides NO imagepath.

Instead, I now wrap [2] the dot invocation in
delete SERVER_NAME;
dot … ;
set SERVER_NAME;

The last statement (set …) is to make sure the web server and any other logic does not get confused about the environment. My insecurity is showing.

[1] GraphvizPlugin/lib/Foswiki/Plugins/GraphvizPlugin/Core.pm at master · foswiki/GraphvizPlugin · GitHub line 164
[2] GraphvizPlugin/lib/Foswiki/Plugins/GraphvizPlugin/Core.pm at Item15299 · BramVan-Oosterhout/GraphvizPlugin · GitHub line 228

Bram

In response to the suggestion:

Yes, I could create a script: dot-env_without_servername.sh:
env --unset=SERVER_NAME dot "$@"

I expect that will work in most current Unix environments. But it would need a separate solution on Windows.

Perl takes care of the portability with the delete/set.
Supporting the switch in dot will do that too.

Regards, Bram
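
For readers who want the portability without touching the calling process's environment, here is a minimal Python sketch of the same idea: copy the environment, drop SERVER_NAME from the copy, and hand only the copy to the child process. The helper names `env_without` and `run_dot` are made up for illustration, and the sketch assumes a `dot` executable is on the PATH.

```python
import os
import subprocess


def env_without(name, base=None):
    """Return a copy of the environment with `name` removed (if present)."""
    env = dict(os.environ if base is None else base)
    env.pop(name, None)
    return env


def run_dot(args, dot="dot"):
    """Run dot with SERVER_NAME hidden from the child process only.

    The parent's os.environ is never modified, so there is nothing to
    restore afterwards. `dot` is assumed to be on the PATH.
    """
    return subprocess.run([dot, *args], env=env_without("SERVER_NAME"), check=True)
```

A call might look like `run_dot(["-Tpng", "-Gimagepath=images", "-o", "out.png", "in.gv"])`, where the file and directory names are placeholders. Because only the child sees the modified environment, the delete/set dance around the call is not needed.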

Is it considered bulletproof that if SERVER_NAME isn’t set, then it’s ok to read arbitrary files?

This is merely a heuristic I coded back when the world was considerably simpler and less dangerous.
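
As a rough illustration of the heuristic as it is described in this thread (not a transcription of the C code in lib/common/utils.c), the following Python toy model refuses to resolve any file when SERVER_NAME is present and otherwise searches the imagepath directories in order. The function name `find_image` and the exact fallback when no imagepath is given are assumptions made for the sketch.

```python
import os


def find_image(filename, imagepath, environ=None):
    """Toy model of the heuristic: first match on imagepath, or None."""
    environ = os.environ if environ is None else environ
    # Fail closed: anything that looks like a web server loads no files.
    if "SERVER_NAME" in environ:
        return None
    if not imagepath:
        # Assumption for this sketch: without an imagepath, use the name as given.
        return filename if os.path.isfile(filename) else None
    # imagepath is a list of directories separated like PATH (":" or ";").
    for directory in imagepath.split(os.pathsep):
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

The model shows why the check is only a heuristic: it says nothing about who controls the graph or the files, only whether one particular variable happens to be set.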

I doubt that it is bullet proof in all environments. But it is definitely an obstacle if one wants to use dot inside a script from a web server.

Once the SERVER_NAME is removed, dot does not find any other impediment in my environment.

Are you aware of any other safeguards implemented in dot?

BTW, if -GdenyImagePathInServer=false switch implementation is considered, I think a better name would be:
-GImagePathInServerDenied=false

It will appear next to the other imagepath parameters and the default would unambiguously read:
-GImagePathInServerDenied=true

That makes it clear that the path is not allowed…

To clarify what I had in mind, I was intending to reintroduce imagepath but as a command line option. I think it’s safe to consider anyone passing this has opted into loading files from this path and thus the value of $SERVER_NAME is irrelevant. I.e. you would get the following behavior:

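| $SERVER_NAME | --imagepath=foo | attribute imagepath="bar" | result |
|---|---|---|---|
| unset | | | no imagepath |
| unset | ✓ | | imagepath is “foo” |
| unset | | ✓ | imagepath is “bar” |
| unset | ✓ | ✓ | imagepath is “foo” |
| set | | | no imagepath |
| set | ✓ | | imagepath is “foo” |
| set | | ✓ | no imagepath |
| set | ✓ | ✓ | imagepath is “foo” |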

This seems to satisfy your use case (--imagepath=foo works universally) while retaining the fail-closed behavior (users with $SERVER_NAME set don’t surprisingly get a lack of sandbox on upgrade).
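
The decision rule behind that table can be written down in a few lines. This is only a sketch of the proposal as stated in this post, with a hypothetical function name, not code from Graphviz:

```python
def proposed_imagepath(server_name_set, cli=None, attr=None):
    """Effective imagepath under the proposal; None means "no imagepath".

    cli  -- value of the proposed --imagepath option, or None if absent
    attr -- value of the in-graph imagepath attribute, or None if absent
    """
    if cli is not None:
        # Passing --imagepath is an explicit opt-in, so $SERVER_NAME is irrelevant.
        return cli
    if server_name_set:
        # Fail closed: the in-graph attribute alone is ignored on a server.
        return None
    return attr
```

Each of the eight rows corresponds to one call, for example `proposed_imagepath(True, attr="bar")` for the row where $SERVER_NAME is set and only the attribute is given.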

And as a reminder, I do not consider any of these to be reasonable sandboxing mechanisms. If you’re accepting untrusted user input, we expect you to take your own steps to sandbox Graphviz.

Do I understand you correctly that there will be a new command line parameter --imagepath?
And that parameter is always honoured when it is set?

Yes to this proposal. That will work for me.

And yes to "If you’re accepting untrusted user input, we expect you to take your own steps to sandbox Graphviz."

A comment: In the case quoted below:

I would expect imagepath=foo;bar instead of imagepath=foo. I.e. both foo and bar are honoured.

Yes, that’s what I was thinking.

Interesting wrinkle. I had not actually realised imagepath is a list until now. And yes, I agree with you that --imagepath=foo and imagepath="bar"’s least surprising outcome is probably foo;bar (equivalently, foo:bar on non-Windows).

A couple of things I’m undecided on:

  1. Which order should these be preferenced in? I.e. should --imagepath=foo+imagepath="bar" actually result in bar;foo instead of foo;bar? I think the answer is no. The command line option should be treated as something higher priority than the in-graph attribute.
  2. Maybe the “set ✓ ✓” row should also result in foo;bar? The obvious answer is no, but the reason I’m in doubt about this is because this behaviour encourages people to think of this as a sandbox once again. That is, that the command line option is a way of constraining any imagepath-based file reading.

Re 1: In principle I agree it should be foo;bar.
There are a few cases of imagepath=foo + imagepath=bar.
a) --imagepath foo --imagepath bar
→ foo;bar
b) -Gimagepath=foo --imagepath=bar
→ bar;foo
c) -Gimagepath=foo [imagepath=bar]
→ foo;bar
In words:
--imagepath takes precedence over -Gimagepath, which takes precedence over [imagepath=…]
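
One way to read this ordering is as a plain concatenation, highest priority first. The sketch below uses a hypothetical function name and the ";" separator from the examples (on non-Windows systems the separator would be ":"); it describes the suggestion in this post, not existing Graphviz behaviour.

```python
def combined_imagepath(cli=(), g_option=(), attr=(), sep=";"):
    """Join all sources into one search list, highest priority first.

    cli      -- values given via the proposed --imagepath option
    g_option -- values given via -Gimagepath=...
    attr     -- values given via the in-graph [imagepath=...] attribute
    """
    return sep.join([*cli, *g_option, *attr])
```

With this reading, case (b) is `combined_imagepath(cli=["bar"], g_option=["foo"])`, which gives `bar;foo`, and case (c) is `combined_imagepath(g_option=["foo"], attr=["bar"])`, which gives `foo;bar`.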

Re 2: I agree with your obvious answer. No.
--imagepath is set on the command line by the administrator. The administrator can direct a precedence that cannot be overridden by the user. In a server environment, I think even -Gimagepath should be denied. That is how dot currently works. I found out the hard way and that started this conversation :slight_smile:

Regards, Bram

The preferencing is a bit convoluted, but I think I see what you’re saying. What is the use case for passing -Gimagepath=… and --imagepath=… in a non-server environment? We need to define some behavior, but the preferencing you’ve described suggests to me you have some use case in mind for combining these.

In a non-server environment I think the considerations are

  1. backward compatibility. Currently people would use -Gimagepath=…
    That should keep working as is when --imagepath is not used. If --imagepath is used, it can replace the -Gimagepath=…. If both are used, --imagepath should add something. So I would give it precedence.
  2. consistency with the server environment, where --imagepath takes precedence. That will avoid confusion for developers who will be working and testing in both server and non-server environments.

I don’t quite follow the consistency argument and I’m still confused why anyone would want to combine -Gimagepath=… and --imagepath=.... Isn’t it simpler and more intuitive to just have --imagepath override and suppress everything else regardless of $SERVER_NAME? I.e.

| command line | attribute imagepath="bar" | result |
|---|---|---|
| | | no imagepath |
| --imagepath=foo | | imagepath is “foo” |
| | ✓ | imagepath is “bar” |
| -Gimagepath=baz | | imagepath is “baz” |
| -Gimagepath=baz | ✓ | imagepath is “baz;bar” |
| --imagepath=foo | ✓ | imagepath is “foo” |
| -Gimagepath=baz --imagepath=foo | | imagepath is “foo” |
| -Gimagepath=baz --imagepath=foo | ✓ | imagepath is “foo” |
| --imagepath=foo -Gimagepath=qux | | imagepath is “foo” |
| --imagepath=foo -Gimagepath=qux | ✓ | imagepath is “foo” |
| -Gimagepath=baz --imagepath=foo -Gimagepath=qux | | imagepath is “foo” |
| -Gimagepath=baz --imagepath=foo -Gimagepath=qux | ✓ | imagepath is “foo” |
| --imagepath=foo --imagepath=quux | | imagepath is “foo;quux” |
| --imagepath=foo --imagepath=quux | ✓ | imagepath is “foo;quux” |
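
The override rule in this table is short enough to state as code. This is a sketch with a hypothetical function name; how several -Gimagepath options combine when no --imagepath is present is not covered by the table, so joining them here is purely an assumption.

```python
def override_imagepath(cli=(), g_option=(), attr=None, sep=";"):
    """Effective imagepath when --imagepath suppresses every other source.

    Returns None for "no imagepath".
    """
    if cli:
        # Any --imagepath wins outright; repeated options are joined in order.
        return sep.join(cli)
    # No --imagepath: -Gimagepath first, then the in-graph attribute.
    # Assumption: multiple -G values are joined (the table does not say).
    parts = list(g_option)
    if attr is not None:
        parts.append(attr)
    return sep.join(parts) or None
```

For instance, `override_imagepath(cli=["foo"], g_option=["baz", "qux"], attr="bar")` models the row with two -Gimagepath options around --imagepath=foo and the attribute set.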

I have been away for a while, hence the delayed response.

You are correct that --imagepath will have the same function as -Gimagepath, with one exception.
--imagepath will be searched in an environment where SERVER_NAME exists, whereas -Gimagepath will not be searched when SERVER_NAME exists.

Re: Backward compatibility
No, I don’t think it is more intuitive to just have --imagepath override and suppress everything else regardless of $SERVER_NAME.
The proposal as put will create an issue with backward compatibility. In the example:

With command line: --imagepath=foo -Gimagepath=qux
the proposal is to ignore -Gimagepath, with the result imagepath=foo.

An issue arises when: A script uses dot -Gimagepath=qux.
And a user defines an alias dot=dot --imagepath=foo. The command in the script (dot -Gimagepath=qux) will now only search on the path foo, not qux. The script breaks, unless the images on path qux are repeated on path foo.

The same issue arises with --imagepath=foo and attribute imagepath=bar

I actually use the imagepath attribute. Although I can make the proposed design work, I would prefer this example to resolve to search imagepath foo;bar

My original proposal was to add a switch

That would avoid the issues created by another way to define an imagepath and sidesteps the related precedence issues. That also maintains backward compatibility since the default will be: -GImagePathInServerDenied=true

Re: Consistency.
I think the precedence rules for the non-server and server environment should be the same. The way I understand the proposal is that:
IF --imagepath is set, -Gimagepath and the imagepath attribute will be searched. With --imagepath set, the behaviour should be independent of the existence of $SERVER_NAME. That is “consistent” between the environment with $SERVER_NAME set and the environment without $SERVER_NAME.

I’m getting a bit lost as to what your use case is. You’re running a web server, right? I don’t understand why anyone would be using --imagepath=… in combination with a dot wrapper script that passes -Gimagepath=…. The whole point of the --imagepath=… mechanism is to redirect where images are sourced from. Why would you want some downstream -Gimagepath=… to also take effect?

I ended up posting something that essentially restores GV_FILE_PATH: add a '--filepath=…' option to replace '$GV_FILE_PATH' (!3704) · Merge requests · graphviz / graphviz · GitLab
