Print only specified node and it's dependencies from large graph

Knaldgas · January 28, 2022, 8:52am

I have a tool that generates DOT-files representing package dependencies for a large project. Brainlessly generating a pdf on that graph gives a pdf with a size of 26 x 0.5 meters o_O

As a first step I would like to generate graphs showing only a specified node and all of it’s dependencies (sub-set of the main graph).

How can I achieve that?
From what I’ve found so far, gvpr seems to be a good bet, although I’m not sure where to start with gvpr (I probably just need the “keyword” to find the obvious?)

mark · January 28, 2022, 10:28am

How are you generating the original dot file? Might be easiest to filter there. There is some prior art here for software dependency graph querying, for example
https://docs.bazel.build/versions/main/query-how-to.html.

scnorth · January 28, 2022, 1:32pm

The gvpr man page includes an example that “Prints the depth-first traversal of the graph, starting with the node whose name is ARGV[0], as an indented list.” You could modify this to put the results into a node-induced subgraph and print the subgraph (so the output would include any attributes on the input objects) or if the graph doesn’t have attributes, just print the result in graphviz format.

steveroush · January 28, 2022, 8:25pm

Here is slightly-modified and lightly-tested version of the “cliptree” gvpr program that comes with the Graphviz package source. It is something of a mystery to me, but it seems to work. Save this file as subGraph.gvpr.

/* Construct subgraph reachable from node ARGV[0] by forward edges */
BEG_G {
  node_t r = node($,ARGV[0]);

  $tvroot = r;
  $tvtype = TV_fwd;
  //print ("//  Staring node: ", r.name);
}
N{
  print ("// subgraph node: ", $.name);
  $tvroot=NULL;
  subnode($T,$);
}
E{
  print ("// subgraph edge: ", $.name);
  subedge($T,$);
}

It needs a dot-language source file as input. Change your generating program to produce dot output format instead of pdf.
Here is the (Linux) command line:
gvpr -f subGraph.gvpr myfile.dot > subgraph.dot

Other thoughts:

Wow, surprising that pdf will support such a large file
How big (nodes and edges) is your graph? (gc myfile.dot will give node & edge counts)
There are some tricks to reduce the physical size of a output graph, but maybe not enought for this graph.

Knaldgas · January 29, 2022, 6:16pm

Original dot-file is made by 3.-party tool. I do have source, but in my mind this must be a post-processing job. The language is Ada btw.

Knaldgas · January 29, 2022, 6:20pm

Thanks, I will try this as soon as I get back to the office.

The output is already dot, the pdf was generated by graphviz.

I don’t have the code right here, but I believe there are about 700 nodes (don’t know amount of edges). I think 4-500 nodes have the same rank.

Yes, making sub-graps and clusters are likely the next step.

scnorth · January 29, 2022, 11:52pm

It’s reasonable to extract a smaller graph from a larger one.

You can see the need for better interactive tools for exploring large(r) networks. If this were a tree, one would probably use one of those tree walkers in d3.js

It would not be a terribly huge leap for us to do this.

Stephen

mark · January 30, 2022, 12:06am

I agree, as a strawman, I feel it should be possible to adapt the general bazel query language to general graphviz graphs, so long as there are ways to identify nodes. This is just the query language I have the most experience with.

Or maybe the graph databases have a query language. Or maybe GraphQL.

steveroush · January 30, 2022, 6:13pm

Can you provide your source Graphviz input?
Also there is a Graphviz program unflatten (https://graphviz.org/pdf/unflatten.1.pdf) that might helpfully rearrange your result.

Knaldgas · January 31, 2022, 3:43pm

640 nodes and 4600 edges gave the 26 x 0.5 meters pdf, but found that I hadn’t processed all…
The total added up to 1170 nodes and 11269 edges, generating a pdf at roughly 63.5 x 1.1 meters pdf.

steveroush’s gvpr program makes it possible for me to focus on dependencies for individual applications (the example I just used reduced the graph to 87 nodes and 368 edges). I will now work on collapsing related nodes to simplify the graphs further.

I have put an anonymized version of the graph at https://pastebin.com/43MRvjsq (for a week)
Length of labels are the same, but anonymized.

Thanks for all your help

steveroush · January 31, 2022, 6:46pm

35 ranks, lots of edges per rank.
On average 20 edges for every node - makes for a full graph.
dot will run faster if you add this to your commandline -Gnslimit=2 -Gnslimit1=2 -Gmaxiter=5000 -Gmclimit=.4

MathMonkeyMan · August 11, 2024, 4:54pm

For anyone coming to this thread from a google search, as I did, here’s my annotated version of @steveroush’s script that might help some understand how it works (that and reading through gvpr’s man page):

// Construct subgraph reachable from node ARGV[0] by forward edges.
// This is from <https://forum.graphviz.org/t/print-only-specified-node-and-its-dependencies-from-large-graph/1011/4>

BEG_G {
  $tvroot = node($, ARGV[0]);
  // Depth-first traversal following forward edges only. Starts at $tvroot.
  $tvtype = TV_fwd;
}

N {
  // Setting the root of the next graph to NULL means "stop after we've
  // processed the current graph." This behavior is part of $tvtype.
  // We set the root to NULL multiples times -- once for every node -- though
  // setting it only once would suffice.
  $tvroot = NULL;
  // Add the current node ($) to the target subgraph ($T).
  subnode($T, $);
}

E {
  // Add the current edge ($) to the target subgraph ($T).
  subedge($T, $);
}

steveroush · August 11, 2024, 5:32pm

To give credit where credit is due, I just tweaked the cliptree script in the repository. I believe that @erg wrote these scripts as well as gvpr itself.

Topic		Replies	Views
Filtering a dot file to only nodes and edges of a subgraph? Help	6	1904	November 17, 2022
Python's graphviz module	0	68	February 6, 2025
Segfault during a rendering of a tree Help	8	54	February 20, 2025
Looking for tools outputting DOT files Help	7	167	October 20, 2024
Creating a DOT graph with thousands of nodes Help	2	3491	March 27, 2022

Print only specified node and it's dependencies from large graph

Related topics