I’m trying to build an PHP site tool to display a more custom version of my family tree member information (originally built on ancestry) that won’t require a signin or some third party scraping people just wanting to learn more about my ancestors.
One of the tools I’m looking at is gramps for handling the gedcom file exported from ancestry. I like gramps as a tool to ‘clean up’ the data and manage it locally. I’ve already dived into the sqlite file and it has a lot of the information I would want to use, but I’m noticing that some key fields appear to be crunched into the tuple ‘blob’ and I can’t seem to find any easy way to turn that into useable data in php. For example, I can’t seem to denote what ‘type’ an event is as it appears to be encoded in the blob. Thus I can’t easily extract birth and death info even though i have all the records clear as day - as i don’t have any programmatic way to know which one is type ‘birth’ and which is type ‘death’.
I’ve been trying to figure out how to use some the python code thinking I could either extract any useful information from the blobs and add it to another table, or perhaps find some way to parse it directly in php. But I can’t seem to figure out what libraries to call or how to call them (I’m not a python programmer)
I found my way deep enough to find gramps.gen.lib.event.unserialize, but can’t figure out how to invoke it. I tried to see if i could use something like get_event_from_handle() but I’m not sure how to initialize a db object from the gramps config in a simple script.
Can anyone help me find an easy way to access the blob data? I don’t care if I have to build a simple restapi or do a translation-dump occasionally whenever I update the gedcom file. I just want to show basic event information on a given user and I’m about to give up on gramps as a tool with which to do that.
I finally figured out how to unpickle the blob, but not sure what I’m looking at as far as the data.
Is there a way in a short script to use the builtin get_event_from_handle() – and any similar methods for the other object types – without a huge amount of code?
EDIT:
Oh wait, getting further - I believe I found the field that denotes the type, but it just has an integer (and an empty string - I assume this is the __value and __string in the data model on the wiki, but the string is empty).
Is there a reference somewhere to what the EventType (builtin values) represent?
Another alternative is to use the SQLite export addon to export your data from Gramps to an SQLite database that does not use blobs. If you do so, be aware that you need to make some changes to get all of the descriptions. And you would need to use more SQL to join all of the tables.
excellent. Thank you. And dale’s link is just what i was looking for to translate that data. I’ll likely either create a one-shot python script to call for ‘decoding’ blobs to json output or just mass-convert them into an additional table with php serialization instead.
The BLOB column in SQLite TABLEs is just an internal representation (though serialized) of a Python aggregate object. I don’t recommend accessing directly this data as you are vulnerable to any change in Gramps code, e.g. a change in the order of serialization in the various object methods. Presently, database implementations are merely an emulation of BSDDB key/value pair management utility. Only indexes have been added to the records to speed up retrieval.
Regarding the types, they are encoded in the integer value. The names of the constants are defined in gramps/gen/lib/xxxtype.py where xxx designates the kind of object. When type is CUSTOM, the string is used to differentiate user-defined types.
It’s fugly, but it does what I need: (function in my php which calls python to unpickle the byte)
private function pdecode($blob) {
$cmd = sprintf("import pickle; import base64; import json; print(json.dumps(pickle.loads(base64.b64decode('%s'))))", base64_encode($blob));
$pcmd = sprintf("python -c \"%s\"", $cmd);
$result = exec($pcmd);
$resdec = json_decode($result);
return $resdec;
}
(to get the binary data into a command-line form, I use base64 encoding - to get it back out I json serialize the result)