Full Tree of Life

Podbrushkin · August 29, 2025, 1:48pm

I used “-Ksfdp -GK=0.2 -Goverlap=true” to successfully lay out all known life (4.5M taxa) in just 40 minutes, but I am looking for ways to improve the result.

If you have any suggestions on how to make small branches more readable and the whole tree more evenly distributed, please share. Reducing the K factor increases branch overlap.
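
For readers who want to reproduce the invocation, here is a minimal Python sketch that assembles the same flags into a command line. The file names `life.gv` and `life.svg` and the helper `sfdp_command` are hypothetical, and SVG is only an assumed output format; the flags themselves are the ones quoted above.

```python
import shlex

def sfdp_command(src, dst, k=0.2, overlap="true", fmt="svg"):
    """Build the argument list for an sfdp run with the flags quoted in the post."""
    return [
        "dot", "-Ksfdp",         # select the sfdp layout engine
        f"-GK={k}",              # spring constant K; smaller values pull the layout tighter
        f"-Goverlap={overlap}",  # overlap=true keeps node overlaps (no removal pass)
        f"-T{fmt}",              # output format (assumed, not stated in the post)
        src,
        "-o", dst,
    ]

# Hypothetical input and output file names.
cmd = sfdp_command("life.gv", "life.svg")
print(shlex.join(cmd))
# prints: dot -Ksfdp -GK=0.2 -Goverlap=true -Tsvg life.gv -o life.svg
```

To actually run the layout, the list can be passed to `subprocess.run(cmd, check=True)`.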

magjac · August 29, 2025, 2:20pm

Wow! Your application is amazing

Podbrushkin · August 29, 2025, 2:38pm

Thank you!

steveroush · August 30, 2025, 4:04am

Indeed, most impressive. Most impressive.
I have no direct experience with graphs with millions of nodes, and no useful experience with sfdp. That said, here are some of the attributes I would experiment with:

beautify
overlap (a guess: overlap=prism or overlap=prism3)
overlap_scaling
overlap_shrink
sep
esep
smoothing
levels (maybe)
voro_margin (maybe)
quadtree (maybe)
repulsiveforce (maybe)
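
One way to experiment with the attributes listed above is a small parameter sweep. The Python sketch below generates one sfdp command per combination of values. The attribute names come from the list; the candidate values, the input file `sample.gv`, and the helper `sweep` are illustrative assumptions, not recommendations. Since a full run reportedly takes about 40 minutes, a sweep like this is probably only practical on a small sample of the graph.

```python
from itertools import product

# Candidate values to try. These are guesses for illustration only.
CANDIDATES = {
    "overlap": ["true", "prism"],
    "repulsiveforce": [1.0, 2.0],
    "smoothing": ["none", "spring"],
}

def sweep(src, candidates):
    """Yield (output_name, argv) for every combination of attribute values."""
    names = list(candidates)
    for values in product(*(candidates[n] for n in names)):
        pairs = list(zip(names, values))
        flags = [f"-G{n}={v}" for n, v in pairs]
        # Encode the settings in the file name so the results can be compared later.
        out = "_".join(f"{n}-{v}" for n, v in pairs) + ".svg"
        yield out, ["sfdp", *flags, "-Tsvg", src, "-o", out]

runs = list(sweep("sample.gv", CANDIDATES))  # hypothetical sample file
for out, argv in runs:
    print(" ".join(argv))
```

Each `argv` can be handed to `subprocess.run` to produce the corresponding image.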

For completeness, you could use ccomps to break the graph into its connected components, lay out those graphs using the engine of choice (sfdp?), and then use gvpack to combine them back into a single graph. (Not sure I like the idea, but you never know until you see the result.)
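
The ccomps / layout / gvpack idea can be sketched as a pipeline. The Python below is modeled on the example in the gvpack manual page, with the layout engine swapped to sfdp. The file names and the helpers `component_pipeline` and `run_pipeline` are hypothetical, and this has not been tried at the 4.5M-node scale discussed here.

```python
import shlex
import subprocess

def component_pipeline(src, engine="sfdp", fmt="svg"):
    """Return the pipeline stages as argument lists."""
    return [
        ["ccomps", "-x", src],               # emit each connected component as its own graph
        [engine],                            # lay out every component independently
        ["gvpack"],                          # pack the laid-out components into one graph
        ["neato", "-s", "-n2", f"-T{fmt}"],  # render using the positions already computed
    ]

def run_pipeline(stages, out_path):
    """Chain the stages with pipes; the last stage writes to out_path."""
    procs, prev = [], None
    with open(out_path, "wb") as out:
        for i, argv in enumerate(stages):
            last = i == len(stages) - 1
            proc = subprocess.Popen(argv, stdin=prev,
                                    stdout=out if last else subprocess.PIPE)
            if prev is not None:
                prev.close()  # let the upstream process see a closed pipe if this one exits
            prev = proc.stdout
            procs.append(proc)
        return [proc.wait() for proc in procs]

stages = component_pipeline("life.gv")  # hypothetical input file
print(" | ".join(shlex.join(stage) for stage in stages))
# prints: ccomps -x life.gv | sfdp | gvpack | neato -s -n2 -Tsvg
```

Calling `run_pipeline(stages, "life.svg")` would execute the chain and return the exit code of each stage.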

scnorth · August 31, 2025, 1:51pm

You are at the scale that was explored by the author of sfdp, Yifan Hu.

See his gallery, Visualizing large graphs: graph visualization of matrices from the SuiteSparse Collection

Many examples are under 100K nodes, but his website says “the largest graphs have tens of millions of nodes.” Yifan doesn’t hang out here, so you should contact him through his professional website (Professor Yifan Hu).

He’s really positive and knowledgeable about this topic.

oliversalzburg · September 2, 2025, 10:00am

I absolutely love it! I have had this poster framed on my wall for many years now, and I’m still in love with it. Your graph satisfies on a slightly different level.

Podbrushkin · September 8, 2025, 9:11am

Thank you. Right now I feel like only -GK and -Grepulsiveforce affect the layout in a somewhat desired way. In Gephi, the Yifan Hu layout accepts more parameters, so I will try it too. It seems that Graphviz’s sfdp finds an optimal position for all nodes before applying the force-directed algorithm, which leads to much better (and more consistent) results, while in Gephi the initial positions are random. I am already working with connected components one by one, but I’m using my own wrapper around the Gephi Toolkit to deal with them.

Podbrushkin · September 8, 2025, 9:22am

I’ve visited this gallery multiple times, but those pictures don’t give me any useful information. All datasets and PDFs return “File not found”, there is no description of what those graphs represent, and the command used for the layout is not provided either. I think it would be cool to have a similar gallery for Graphviz, with examples of point-only graphs and of how different layout parameters affect the output.

Podbrushkin · September 8, 2025, 9:45am

Thank you. I think I have seen this chart and its interactive version. I like how all branches point outward from a circle; it gives a more tree-ish look. Also, I wonder how all those micro-branches were drawn.

I have tried the twopi layout, but even for the NCBI database (2.6M nodes across 38 depth levels, with the highest count of 389K at depth 10) the circles become too big. It would be cool if it were possible to tweak the distance from the center for nodes at the same depth, e.g. nodes at depth 5 placed at radius = 0.95x, 1x, 1.05x.

Or maybe another way to make all branches point outwards.
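
The staggered-radius idea above can be sketched in a few lines of Python. This is not a twopi feature. The function names, the `ring_gap` parameter, and the rule of cycling through the factors by node index are all assumptions made for illustration. The angle of each node would still have to come from an actual radial layout.

```python
import math

def staggered_radius(depth, index, ring_gap=1.0, factors=(0.95, 1.0, 1.05)):
    """Base radius for a depth level, nudged by a factor that cycles with the node index."""
    return depth * ring_gap * factors[index % len(factors)]

def to_xy(radius, angle):
    """Convert a polar position (angle in radians) to Cartesian coordinates."""
    return radius * math.cos(angle), radius * math.sin(angle)

# Four consecutive nodes at depth 5 land on three slightly different circles.
radii = [staggered_radius(5, i) for i in range(4)]
print(radii)
```

With the default factors, neighbouring nodes at the same depth alternate between three nearby circles instead of crowding a single one.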

oliversalzburg · September 9, 2025, 10:37am

When a friend of mine, with a background in biology, saw the poster hanging in my apartment, she suggested that it’s not scientifically accurate. She gave me the impression that it was more artistic than scientific, which I could make my peace with. At the time, I hadn’t even questioned whether the tiny branches had any accuracy, even at the time of publication. I always assumed someone drew them semi-randomly by hand. To me, it serves as a reminder of the richness and diversity of life. Alternative visualizations just contribute more to that.

scnorth · September 10, 2025, 12:01am

Understood, though how can they know what is “accurate” at a detailed level?
