Error using PostgreSQL as database backend with Gramps v5.2.2

In SQLite you can index a field in a JSON document just like a column. This is similar to the way BSDDB can index a field within a BLOB.

I would be against adopting a fully relational model. A document model or a hybrid model would be a better fit for our design.

In both SQLite and PostgreSQL you can store full JSON documents, and they can be indexed and queried like a normal table with multiple columns.

Here is a simple example with code created by Copilot:

-- Create table
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    address JSON
);

-- Insert JSON data
INSERT INTO employees VALUES (
    1,
    'John Doe',
    '{ "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021" }'
);

-- Verify inserted data
SELECT * FROM employees;


EDIT: Forgot to write that this was an example for SQLite…
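For comparison, here is a rough, untested sketch of how the same idea looks in PostgreSQL; it assumes the native jsonb type, and the ->> operator pulls a field out of the document as text:

-- Same table, with PostgreSQL's native jsonb type
CREATE TABLE employees (
    id      integer PRIMARY KEY,
    name    text,
    address jsonb
);

-- Extract the 'city' field like a column, and index that expression
CREATE INDEX idx_employees_city ON employees ((address ->> 'city'));

SELECT name, address ->> 'city' AS city
FROM employees
WHERE address ->> 'city' = 'New York';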


The key point though is that you can index a field within the JSON.

Can you also retrieve its contents with a normal select, like by naming the 'city' column from the example? And what will then happen when you have a list of addresses with dates, or a list of names, like we have in the person object?

See the SQLite documentation for examples of using the json_each function. I think that's what you would use, but I haven't tried it yet.
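As a rough, untested sketch against the employees example above: json_extract reads a single field with a normal select, and json_each expands a JSON array into rows, which is what a list of addresses would need (the addresses column below is hypothetical, holding a JSON array):

-- Read a single field like a normal column
SELECT name, json_extract(address, '$.city') AS city
FROM employees;

-- Expand a JSON array into one row per element
-- (assumes a hypothetical 'addresses' column containing a JSON array)
SELECT e.name, json_extract(a.value, '$.city') AS city
FROM employees AS e, json_each(e.addresses) AS a;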

I understand there's a lot to do within Gramps to make the change from BLOBs to JSON, but in the meantime could there be a quick & dirty tool (meaning a simple, stand-alone Python program) that converts a Gramps database? Then we could get used to querying it.


We already have an export to full relational SQLite. Is that what you mean? It converts 99.9% of your data, meaning that maybe the favorites are lost, or something else that I forgot. Your tree data is fully exported.

If you don't, you probably mean a program that replaces everything saved as pickles with data saved by the json.dumps method. That's at least what I read about the subject.

I also found that json_each function, but to me it feels like putting the cart before the horse, and it makes me think that the whole conversion to JSON is a waste of time, because it makes queries far more complex than they would be with a fully relational DB, like the one used by RootsMagic.

You also need to read the section about indexes on expressions.
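Roughly, and again untested: an index on a JSON expression lets SQLite answer a query on a field inside the document without scanning every row, provided the query repeats the indexed expression exactly:

-- Index the 'city' field inside the JSON document
CREATE INDEX idx_employees_city
ON employees (json_extract(address, '$.city'));

-- This query can now use the index
SELECT name
FROM employees
WHERE json_extract(address, '$.city') = 'New York';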

With its JSON capabilities SQLite can be used as a document store as well as a relational database. This hybrid functionality could suit us very well.

The document store features have benefits:

  • The one-to-one correspondence between objects and rows is maintained.
  • Hierarchical data is stored without the need for table joins.
  • The document approach makes schema changes easier.
  • The same JSON could be used on different backends: SQLite, PostgreSQL, MongoDB, ArangoDB, …

The relational features also have benefits. Even with our data model, joins would be useful. For example, a join to the event table from the person table could provide the data for our person views.
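As a sketch of what such a join could look like, assuming hypothetical person and event tables that each store a handle plus the object as a JSON document (the column and field names are made up for illustration):

-- Hypothetical schema:
--   person(handle TEXT PRIMARY KEY, json_data JSON)
--   event(handle TEXT PRIMARY KEY, json_data JSON)

-- One row per person/event pair, e.g. to feed a person view
SELECT json_extract(p.json_data, '$.gramps_id') AS person_id,
       json_extract(e.json_data, '$.type') AS event_type,
       json_extract(e.json_data, '$.date') AS event_date
FROM person AS p,
     json_each(p.json_data, '$.event_ref_list') AS r
JOIN event AS e ON e.handle = json_extract(r.value, '$.ref');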


No.

Yes (I think).

I just want a quick way to convert an existing database into one having all the same tables, with JSON instead of BLOBs. It could be good to keep all of the other existing columns, even though some are included in the JSON, because then we could compare index performance both ways.

I don't think that our objective is to create a full relational DB. If we wanted that, we could just use RootsMagic, for example. Gramps is based on a hierarchical object model, not a relational one. We could of course create a relational backend, but is that what we really want?

It is clearly possible to write a good genealogy product using either a relational model or a NOSQL model.

I already did this in pull request #800. We were able to compare performance, but most indexes were still unused as they are in our current code.

Enhancing our filters to use the database indexes is a separate task. This can be done with whatever backend storage solution we choose.

And I would probably use that, if my mother tongue were English, but it's not. And Gramps is better than all programs made in The Netherlands, and also better than other American-made programs with a Dutch translation, like Brother's Keeper, My Heritage Family Tree Builder, and Legacy.

It is difficult to say what 'we' want, if there is a 'we' at that level. I can understand the reasoning behind the architecture that we have, but IMO, it should not block a transition to another model. And I have always liked the relational model, partly because I'm used to it, and partly because I hate redundancy.

And what is good anyway? Does that mean that it works clearly and fast? Or does it also mean that people of all persuasions can access the data with their tools of choice? If it's the latter, I'm inclined to think that a relational model is better, but in reality I rarely use the tools mentioned by others.

Right. I'm interested in querying from outside of Gramps, i.e. writing my own SQL queries, as I have sometimes done with the current SQLite export. I'd like to gain some experience using SQLite's JSON functions, so I am looking for a quick way to convert an existing database. But if your PR includes a converted version of the "example" database, that is all I need; how can I download it?

No, you couldn't. RM (and all the other software packages I know of) is basically limited to GEDCOM. That means that it knows only events and roles that are supported by GEDCOM. Gramps is to my knowledge the only software that overcomes that limitation. I could add a lot more, e.g. regex support, an XML export that is an ideal starting point for "piping" to downstream applications, and so on. And last but not least: a vibrant developer community instead of dependence on the fate and habits of a single developer.

Therefore Gramps is much more powerful than anything else on the market, but it is also much slower, and that should not be. I fully appreciate and accept your argument about the object framework you want to conserve and leverage, but there must be a way to combine this framework with modern database backends and to leverage their power. I strongly believe that this is currently the most important strategic issue.


If you follow the link to the PR, you see Nick's fork of Gramps. And if you clone that, and check out the proper branch, you can build and run it from source. That's easiest on Linux, but it can be done on other OSes too, if you know what to do.

This branch seems to have been merged with 5.0, so you need to travel back in time a bit.

If it has been merged with 5.0, it should still be in 5.1.6 and 5.2.2, right?

I agree with you that we should work to make Gramps faster. Ideally we should be able to meet the Confucius Challenge, but certainly a tree with 100,000 people is a reasonable target. I only use a tree with 20,000 people for testing.

Hopefully we can leverage the power of SQLite and keep our existing object framework. This is why the conversion of BLOBs to JSON is appealing. It gives us better indexing options without duplication of data and will allow joins.

I would regard querying the database from outside Gramps as a secondary objective.

Maybe I misunderstand something, but if you already have a fully relational Gramps database in the form of the SQLite export (or my PostgreSQL schema), why should you then make your life more complicated by playing around with JSON objects and functions? At the end of the day, a JSON-based database will probably always be more complex than a simple relational model. This added complexity may be justified since it makes it possible to keep the Gramps object framework, as Nick points out, but if you are already beyond that framework in a truly relational data model, then this justification is not valid anymore and there is no need to play with JSON objects.

First, can I say this thread appears to have gone way beyond the original discussion:

"It is clearly possible to write a good genealogy product using either a relational model or a NOSQL model."

The answer to this is yes, but reading a lot of the threads, GRAMPS is being used a lot for what I would term marginal genealogy projects (nothing wrong with this of course), and development is being driven by the people who need improvements in these areas or who want to push the limits of the software development.

I do not care how the final database backend looks or feels as long as all the data is understandable to the naked eye via 3rd party applications. So no blobs!

Additionally, I do not care in the slightest whether GRAMPS is or is not GEDCOM compliant, as long as there is a means of exporting a basic GEDCOM compliant file.

Maybe it is time to go back to basics: build a backend that meets the desired modern requirements and then start again with the interface, and if any 5.2.2 programs are suitable for reuse then fine; if not, dump them. All products reach an end of life, and no amount of tweaking and fiddling really solves the problem.

phil

No, but you can blame my wording for that. It's in Nick's own fork, where it seems that he merged 5.0 into that branch, so it's the other way around.

Ah, I didn't know about that Confucius challenge, but it really sounds interesting. I'm just working on a few tests with a 650k tree (courtesy of Enno), so that we'll get a few hard figures in our hands.

That's of course a valid opinion of yours, but this objective will most probably be achieved automatically the moment it is possible to use a modern backend without BLOBs. Again, the BLOBs are the problem, since they are opaque to everything outside of Gramps.