Graphviz Export - Wish for Feature

StoltHD · July 19, 2020, 2:10pm

As some of you may have noticed, I use a few other software packages for my research; two of them are Cytoscape and Tulip …
Both of these can read .gv and .dot files.

My question, or wish, is whether someone who knows how to program it could create a graphical report containing everything in the Gramps database, with all connections and relations.

The following would be a kind of “Spec”:

People as Nodes (Full name as label, with birth and death data à la “Persson, Johan (b. 1870 - d. 1920)”)
Families as Nodes, with the Surnames of the Partners as Labels (e.g. “Family of Jonsson & Larsdatter”)
Events as Nodes, with Description or type as Label (and dates if it’s not possible to define a date range in the format)
Citations as Nodes
Sources as Nodes
Repositories as Nodes
Places as Nodes, with the Hierarchy of the Places as a Sub-Graph
Notes as Nodes
Media as Nodes
All types of Relations, Roles, and Types as Edges; this would also include Gramps Internal Links in Notes, with the Name/Description as Label and also with Dates if there are no date attributes in the Graphviz format
It would be great if all Media Files could have the path added as a “hyperlink”, and it would be great if any other hyperlink could be added
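
As a rough illustration of the spec above, here is a minimal Python sketch that writes a DOT file by hand using only the standard library. It is not an existing Gramps report: the sample records, the `SHAPES` mapping and the media path are all made up for the example, and a real exporter would have to read the objects from the Gramps database instead. The `label`, `shape` and `URL` attributes and the `cluster` subgraph naming are standard Graphviz features.

```python
# Hypothetical in-memory records; a real exporter would read them from Gramps.
NODES = [
    # (id, kind, label, optional hyperlink)
    ("I0001", "person", "Persson, Johan (b. 1870 - d. 1920)", None),
    ("F0001", "family", "Family of Jonsson & Larsdatter", None),
    ("E0001", "event", "Birth (1870)", None),
    ("P0001", "place", "Norway", None),
    ("P0002", "place", "Bergen", None),
    ("M0001", "media", "Portrait", "file:///media/portrait.jpg"),  # made-up path
]
EDGES = [
    # (source id, target id, edge label)
    ("I0001", "F0001", "Child"),
    ("I0001", "E0001", "Primary"),
    ("E0001", "P0002", "Place"),
    ("P0001", "P0002", "Encloses"),
    ("I0001", "M0001", "Media"),
]
# Assumed mapping from object type to a Graphviz node shape.
SHAPES = {"person": "box", "family": "ellipse", "event": "diamond",
          "place": "folder", "media": "note"}


def quote(text):
    """Return text as a double-quoted DOT string."""
    return '"' + text.replace('"', '\\"') + '"'


def node_line(node_id, kind, label, url):
    """Build one DOT node statement with label, shape and optional URL."""
    attrs = ["label=" + quote(label), "shape=" + SHAPES[kind]]
    if url:
        attrs.append("URL=" + quote(url))
    return "  " + quote(node_id) + " [" + ", ".join(attrs) + "];"


def to_dot(nodes, edges):
    """Serialize nodes and labelled edges as a DOT digraph."""
    lines = ["digraph gramps {"]
    lines += [node_line(*n) for n in nodes if n[1] != "place"]
    # Places go into a cluster subgraph so the hierarchy is drawn together.
    lines.append("  subgraph cluster_places {")
    lines.append('    label="Places";')
    lines += ["  " + node_line(*n) for n in nodes if n[1] == "place"]
    lines.append("  }")
    for src, dst, label in edges:
        lines.append("  " + quote(src) + " -> " + quote(dst)
                     + " [label=" + quote(label) + "];")
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(NODES, EDGES)
print(dot_text)
```

Saving the printed text as a `.gv` file should give something that Graphviz, and tools that read DOT, can open. Dates are folded into the labels here because that is the fallback the spec itself suggests.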

…
Graph software like Gephi, Tulip, Cytoscape, Palladio, yEd, Social Network Visualizer and the new Constellation are great tools for analyzing data and consolidating data from other places, but all of them need CSV or graph-specific formats …

The GV file generated from, e.g., the Relationship Graph report can be opened directly in both Tulip and Cytoscape, so I know it works …
…

Another wish would be if anyone could make a full export to a graphML or JSON-LD format …?
I know yEd can import GEDCOM, but GEDCOM is an extremely limited format in many ways …
I have a feature request for a “full” CSV export, but it seems not many use that format even though it’s one of the most utilized formats in the “research industry”, both regarding DNA and other linked data …
…
Does anybody else use this type of tool in their research?

prculley · July 19, 2020, 2:41pm

Given the potential complexity of data attached to persons and families etc., I would think that this would give Graphviz quite the complex job to graph. I would expect that each person would get a constellation of nodes around him, many of them quite large. Similar for Families. And many objects also include notes and citations, so some of the events, and media would get their own constellation.

In my own database, getting the ~1000 persons graphed so that they can be seen is already quite the challenge; I cannot imagine how you could actually use such a graph, even if Graphviz didn’t choke on it. If you are serious about this I expect you will have to learn Python and Gramps ways of doing things, or get someone to do this for you. Good luck.

emyoulation · July 19, 2020, 3:11pm

Another related question:
Has anyone seen a good graphic of such a constellation?

I was thinking that sometimes my filters don’t find everything expected. But manually validating the results is too difficult.

So, over time, I’ve created a few pseudo trees which have examples of various relationships… but with the name and/or ID representative of the relationship (instead of John Smith & Mary Jones). I can merge trees to test my filters against complex criteria.

I was thinking that if someone had seen a constellation visualization that represented a truly diverse dataset, it would be good to recreate that tree & visualization in Gramps. Then have a filter test that would always have all those nodes in the same spots but light up the nodes that the filter allowed.

Something like testing those old strings of Christmas tree lights every year!

emyoulation · July 19, 2020, 3:16pm

And, yes, having more than people as Nodes would be intriguing.

emyoulation · July 19, 2020, 3:33pm

Have you looked through GEPS 30: New Visualization Techniques? Fascinating reading… particularly when you follow the references!

It seems like there are a LOT of academic classes doing programming projects involving Genealogy now. Like the Summer of Code, it represents a great opportunity to get creative expansions of Gramps. Promoting Gramps as a framework for class projects might be an opportunity.

Could be like trying to drink from a fire hose though!

GeorgeWilmes · July 20, 2020, 2:17pm

Yes, as I mentioned in an earlier discussion:

I have experimented with the following approach:

- export to Gramps xml format (uncompressed)
- use XSLT to rearrange the data into graphml format
- use a tool such as yEd to view the graphml file

I would not do this for family tree graphs, since Gramps already has so many good capabilities. Rather, the approach could be used to try other kinds of graphs. Anything can be defined as a “node” or an “edge”. For example, the nodes in the graph could represent people and places, and the edges could represent the events that relate the people to the places. The size, color, etc. of the nodes and edges can signify other variables. Graphs can also be nested.
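
The people/places/events example can also be sketched in Python instead of XSLT. The snippet below is only an illustration, not the transform used in this thread: the sample records are invented, the key names `kind`, `label` and `event` are arbitrary choices, and reading the real Gramps XML export is left out. It writes people and places as GraphML nodes and each event as an edge between them.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Hypothetical sample records; a real run would read them from a Gramps export.
people = {"I0001": "Persson, Johan", "I0002": "Larsdatter, Anna"}
places = {"P0001": "Bergen"}
# Each event becomes an edge: (person id, place id, event type)
events = [("I0001", "P0001", "Birth"), ("I0002", "P0001", "Residence")]


def tag(name):
    """Qualify a tag name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{name}"


def build_graphml(people, places, events):
    """Return a GraphML document as a string."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(tag("graphml"))
    # Declared keys let a viewer map "kind" or "event" to color, shape, size.
    for key_id, target in (("kind", "node"), ("label", "node"), ("event", "edge")):
        ET.SubElement(root, tag("key"), {
            "id": key_id, "for": target,
            "attr.name": key_id, "attr.type": "string",
        })
    graph = ET.SubElement(root, tag("graph"),
                          {"id": "G", "edgedefault": "directed"})
    for kind, records in (("person", people), ("place", places)):
        for node_id, label in records.items():
            node = ET.SubElement(graph, tag("node"), {"id": node_id})
            ET.SubElement(node, tag("data"), {"key": "kind"}).text = kind
            ET.SubElement(node, tag("data"), {"key": "label"}).text = label
    for person_id, place_id, event_type in events:
        edge = ET.SubElement(graph, tag("edge"),
                             {"source": person_id, "target": place_id})
        ET.SubElement(edge, tag("data"), {"key": "event"}).text = event_type
    return ET.tostring(root, encoding="unicode")


graphml_text = build_graphml(people, places, events)
print(graphml_text)
```

Because the node type travels as a `data` value, a viewer that supports mapping data attributes to visual properties can use it to size or color the nodes.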

Tools such as yEd and others will also do some automatic clustering of data in graphs. I have tried using that to make some sense of my DNA matches (importing the csv files directly into yEd).

More recently I found an “R” language package called “ggenealogy” (with two g’s at the beginning) that has some interesting features. The examples in the article are related to other types of genealogy (soybeans and academics) but they would be applicable to family history and things as well. It is not just for visualization, but also computation. Even if you don’t want to use R, it might give you some ideas.

The ggenealogy package uses another package called igraph, for which there is also a python version.

emyoulation · July 20, 2020, 4:55pm

Maybe I’m reading it wrong, but it seems like this idea starts in one direction & takes a left turn.

One is about output formats that would write the current object-oriented graphing output in another format, one that is specific to particular chart editing tools. This would allow graphs (that might have a few awkwardly placed nodes in a Gramps-generated chart) to be lightly tweaked.

The other is a data export that writes Gramps objects in another node & link file format. This allows another visualization tool to lay out the nodes in network charts that don’t exist in Gramps… or are too complex for any of our current visualizations.

There is an interesting item in George’s comment:

If you’re doing the same XML transforms with XSLT repeatedly, then you should be able to write up a detailed list of transforms. Those details would be a great stepstool to an XML dialect Export add-on.

GeorgeWilmes · July 20, 2020, 6:48pm

Here is a subset of the Gramps “example” database converted to graphml via XSLT.

It has only a few node types (person, family, event, place) and only a few details (names of people and places). The edges simply connect the nodes. Much more could be done with the data. But maybe it is enough for you to try loading into one of your preferred tools.

StoltHD · July 20, 2020, 7:14pm

Thanks, I can maybe look at it and see if I can figure out a workflow for doing it …

GeorgeWilmes · July 21, 2020, 7:53pm

For those who may be wondering why bother with another graphing tool, when Gramps already produces many nice charts: another benefit of using a graph viewing program is that it can give you a sense of the scope of your research, in a way that you don’t get just by looking at trees containing only people and families.

In the attached screenshot (which I hope you can enlarge to view), notice the disconnected sets of nodes over on the right side. In this case, they include not only persons and families, but also events and places, that are not connected to the mass of data on the left. (Again, for this example I used the Gramps “example” database.)
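
The disconnected sets of nodes can also be found without a viewer. As a small sketch (the IDs below are invented, and how the node and edge lists are obtained from Gramps is assumed), a breadth-first search over the edges groups the nodes into connected components:

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Group nodes into connected components, largest first."""
    neighbours = defaultdict(set)
    for a, b in edges:
        # Treat every link as undirected for the purpose of connectivity.
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            current = queue.popleft()
            component.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return sorted(components, key=len, reverse=True)


# Hypothetical IDs: one main cluster, one small island, one orphaned place.
nodes = ["I0001", "F0001", "E0001", "P0001", "I0099", "E0099", "P0500"]
edges = [("I0001", "F0001"), ("I0001", "E0001"),
         ("E0001", "P0001"), ("I0099", "E0099")]

components = connected_components(nodes, edges)
for component in components:
    print(len(component), component)
```

Everything after the first component is a candidate for the “islands” visible on the right side of such a graph.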

Tools such as yEd (which I used here) allow you to vary the shape, size, color etc. of nodes and edges based on attributes in the data. So, for example, you could color the person nodes green, the family nodes red, etc. I have not done that here.

Another feature of such tools is to find hierarchies or clusters within the data. And it’s worth repeating that the nodes and edges can represent whatever you choose to put in the input file – sources, citations, anything.

emyoulation · July 22, 2020, 4:57am

yEd looked interesting. Downloaded & installed it, and noticed along the way that it imports GEDCOM. And the GEDCOM dataset exported from example.gramps was far more complete than the transformed XML file… without any tweaking.

‘One Click’ layout: [image: example2a]

Organic layout: [image: example3a]

GeorgeWilmes · July 22, 2020, 2:50pm

Yes, yEd is quite handy for viewing GEDCOM files, if you just want to see persons and families as nodes.

To be clear, the example graphml file contains only a subset of the data elements in the xml export, but it does include separate nodes for events and places, thus making the place hierarchy visible. (In yEd, you need to use Edit - Properties Mapper to decide how you want the different node types to be sized, colored, labeled, etc.)

The point was that one can create nodes and edges from any type of data, not just persons and families. The only limit is one’s ability (and patience!) in coding the XSL, and mine is somewhat limited.

GeorgeWilmes · July 22, 2020, 3:51pm

I found this GEDCOM-to-JSON-LD converter. That led me to this site where you can load a GEDCOM file for some interesting visualizations.

Here is someone else’s project. It concludes that it would be necessary to “create a GEDCOM-appropriate @context object” which of course is difficult due to all of the issues with GEDCOM and the way different programs use and extend it. The first converter above makes no attempt at that, but rather “maps a few [existing] ontologies to various parameters in GEDCOM”.

I imagine it could be possible to create a @context object based on the Gramps data model, and it could leverage things like GeoNames. Then perhaps a JSON-LD export could be possible. But I really don’t know much about it.
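
To give a feel for what that might look like, here is a small Python sketch that builds a JSON-LD document for one person. It is purely illustrative: the `gramps` vocabulary URL and the `example.org` identifiers are placeholders (no published Gramps ontology is assumed), and only a few schema.org properties are reused. A real export could point `birthPlace` at a GeoNames URI instead of a local one.

```python
import json

# Placeholder vocabulary URL; no published Gramps ontology is assumed to exist.
GRAMPS_VOCAB = "https://example.org/gramps-vocab#"

# The @context maps short property names to full vocabulary terms.
context = {
    "schema": "https://schema.org/",
    "gramps": GRAMPS_VOCAB,
    "name": "schema:name",
    "birthDate": "schema:birthDate",
    # "@type": "@id" says the value is a link to another node, not plain text.
    "birthPlace": {"@id": "schema:birthPlace", "@type": "@id"},
    "grampsId": "gramps:id",
}

# Hypothetical person record using made-up identifiers.
person = {
    "@context": context,
    "@id": "https://example.org/tree/person/I0001",
    "@type": "schema:Person",
    "grampsId": "I0001",
    "name": "Johan Persson",
    "birthDate": "1870",
    "birthPlace": "https://example.org/tree/place/P0001",
}

document = json.dumps(person, indent=2)
print(document)
```

The hard part is not the serialization but agreeing on the vocabulary, which is the same difficulty the GEDCOM project above ran into.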

prculley · July 22, 2020, 9:10pm

You got me a bit curious to see what a JSON-LD might look like after being converted from GEDCOM (I understand GEDCOM, and JSON, but never heard of JSON-LD before today). I tried to use the converter you mentioned above, only to discover it has a lot of bugs around importing GEDCOM files. Looks like the author was working with a very specific GEDCOM subset, and did not include code to deal with the many GEDCOM possibilities. In any event it would not convert any of several trees I tried. If anyone gets this to work (or any other JSON-LD from a GEDCOM source), I would like to examine both files to see just how hard JSON-LD would be to create.

StoltHD · August 3, 2020, 5:59am

It’s not gedcom to json-ld or RDS or gml or graphml, but I found this …

A gedcom to graphviz script

I have not tested it yet, but it’s not that old, so it should still work…

emyoulation · August 3, 2020, 2:53pm

Where do you suppose this sort of index of complementary tools for Gramps should be gathered in the wiki?

The actual nuts & bolts could go in a “How do I…?” Category article. (There’s a template for writing one.) But strategies for using Gramps with other tools in your Genealogy Toolbox are several steps beyond the manual or the various tutorials.

An overview of the tools makes sense. But I’m not sure how to make this advanced topic visible to the people using our documentation system.

StoltHD · August 3, 2020, 5:13pm

I think it should be an “Advanced usage” topic on the Wiki, with sub-topics for different uses of the data registered in Gramps…
for example:

Advanced Usage
- External Tools
  - Tools and Software list
    - Graphical Tools (Graphs, Image generation etc.)
      - Software 1
        - Export Data
        - External manipulation of data
      - Software 2
    - Text based tools (Writer software etc.)

PS. The list is just an example to show different topics in a sort of hierarchy; it is not meant as a blueprint…

Different methods may be needed to use the data in different software, so some kind of structured “How to” for each would be helpful…

And maybe some kind of “Show Case” or “Use Case” section, where logged in Gramps users can add images of their own graphs or usage?

emyoulation · August 3, 2020, 9:33pm

Thanks. I tossed together a sample based on the outline you suggested. (I realize that you said it wasn’t meant as a blueprint but converting a sample is easier than building from scratch.)

With this, I can integrate suggestions as they appear… or WikiContributors can tweak it directly.

If there is a suggestion for a different page name, we can re-use the content there. Then use the “Advanced topics” pagename for developing a learning Road Map for growing a person’s skills past what can be done with a Reference Guide: strategies, improving efficiency, growing developer skills, et cetera.

StoltHD · August 3, 2020, 11:35pm

I can give you a bibliography list of approx. 2000 different open source software packages that somehow fall under this topic

BUT, I will create a wiki user, and see if I can help out a little from time to time… and yes, I will narrow down the list…

There are some 5-10 applications, or plug-ins to applications, that I think can be useful, and that I can write a few lines about.
I will start by writing the English parts. I’m not sure how useful it will be to write in Norwegian, since most Norwegians and other Scandinavians that use Gramps read and speak English really well, and at some time in the future the articles and notes will need to be updated.
If I figure out how to easily create a Norwegian page for the same topic, I might find the time, but I think English should be the priority.

emyoulation · August 4, 2020, 1:56am

That would be much appreciated. There’s been some really great info that has been shared in the Discourse forum lately. And capturing that knowledge in a more structured way should encourage targeted use and skills growth. My hope was to do just a few very closely related tools rather than trying to be comprehensive. Those 5-10 you mentioned would be perfect. (The text editor I added was for MS Windows. Hopefully, someone has a Linux and a macOS option they can recommend.) A sparse framework would give an obvious place to connect new proven techniques as they are revealed.

I keep getting sidetracked by such nuggets of important information with wide application.

There are a few articles of the wiki I’ve partially reorganized and they fall deeper into limbo each time one of these items pops up.
