Roadmap for v5.3 (correction: 6.0)

Part of the problem might be a lack of statistics-gathering tools that would help isolate performance issues. 5.3 will have the Filter+ features for timing the View refreshes. That’s a start.

Given that some users with larger trees than the 100k you reference are not seeing the same performance issues with SQLite, more has to be done to gather information.

For instance, I can see that large Note content editing is VERY slow. A Paste can take minutes.

On Windows, I had a Place hierarchy that made the Grouped Place Selector grow slower as a session lengthened. (Exporting/re-importing and Repair had no impact.) That slowing did not happen once I hacked the selector to be a Flat list instead of hierarchical. (Thanks again Dave!) Nor does it happen with the same tree on Fedora.

Without isolating the causes, those slowdowns are unlikely to be eliminated.

I actually think that most of the main performance issues have been spotted.
But it will take time to change some of this, e.g., moving from blobs to JSON objects, creating the correct indexes for the JSON tables, and perhaps querying database views instead of selecting directly from multiple tables.

But these are changes that will also alter some of the workflow of the Gramps code, so doing it without a major rewrite of large chunks of code, and without the risk of major breakage, might be the biggest job…
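To make the blob-to-JSON idea concrete, here is a minimal sketch; the table and column names are made up, not the real Gramps schema, and it assumes a SQLite build with the JSON1 functions. The point is that a pickled blob is opaque to SQL, while a JSON column can be filtered by the backend itself:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical table, not the real Gramps schema: today the object
# would be a pickled blob, opaque to SQL; as JSON the backend can
# look inside it.
con.execute("CREATE TABLE person (handle TEXT PRIMARY KEY, json_data TEXT)")
con.execute(
    "INSERT INTO person VALUES (?, ?)",
    ("H0001", json.dumps({"surname": "Smith", "given": "John"})),
)

# With JSON, filtering happens in the database instead of
# deserializing every row in Python first.
row = con.execute(
    "SELECT handle FROM person"
    " WHERE json_extract(json_data, '$.surname') = 'Smith'"
).fetchone()
print(row)  # ('H0001',)
```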

When I talk of performance issues, I primarily mean the performance of searches in the Person view. I have seen performance issues in the Places view as well, when adding a new place to the place hierarchy, but that is not something I do frequently, so a performance issue there is annoying but not a real problem. In a large database, however, one is permanently searching for somebody, and a performance issue there basically makes it impossible to work with Gramps.

So I’m currently using a database app that is fed from the Gramps XML file to search for people. This database app is linked to the Dynamic Web Report from Gramps. At the end of the day, the result is a read-only version of my Gramps database with the performance of a modern PostgreSQL backend, which I can use for searches and similar tasks while avoiding the performance issues.
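The real setup uses PostgreSQL behind the Dynamic Web Report; purely as an illustration of the idea, here is a rough sketch that pulls person names out of a Gramps XML backup (gzipped XML) into a small SQLite table for fast searching. The file name is hypothetical, and sqlite3 stands in for PostgreSQL to keep it self-contained:

```python
import gzip
import sqlite3
import xml.etree.ElementTree as ET

def local(tag):
    """Strip the XML namespace that Gramps XML puts on every tag."""
    return tag.rsplit("}", 1)[-1]

# A .gramps backup is gzip-compressed XML.
with gzip.open("tree.gramps", "rb") as f:
    root = ET.parse(f).getroot()

con = sqlite3.connect("search.db")
con.execute("CREATE TABLE IF NOT EXISTS person (gramps_id TEXT, name TEXT)")

for person in root.iter():
    if local(person.tag) != "person":
        continue
    # Collect the given name and surname parts of each name element.
    parts = [el.text for el in person.iter()
             if local(el.tag) in ("first", "surname") and el.text]
    con.execute("INSERT INTO person VALUES (?, ?)",
                (person.get("id"), " ".join(parts)))
con.commit()
```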

I think I have said this before, but: I personally think that GRAMPS is at a crossroads and needs to decide whether the developers (kind and willing as they are) should stop working on tweaks to the current versions so that they can spend their time on the development of GRAMPSX, which moves the fundamental database functionality into the 21st century.

Why do we need regular and quick releases of minor new versions? I am happy with 5.1.6 and know and can live with its limitations; I assume people are tolerant of all current and past releases.

So the next survey should be: carry on regardless (tweaking), or go for the big prize.

phil

4 Likes

I fully agree!

And of course I see the point here. It is always very tempting to play around with tweaks, since this has the benefit of “instant gratification”. If fundamental issues are addressed instead, nobody will see anything new for some time because everything happens behind the curtain, and everything you’re doing creates the risk of a major break, as StoltHD pointed out.

Whilst I understand and agree with your logic, do you not feel GRAMPS is already creaking and groaning under the sheer weight of tweaks (and add-ons), and that people in general would be quite happy to work with what they have got for some time? What is the worst that can happen? You crash your system, reload and reinstall the version of GRAMPS you like, and then recover your database from a backup copy.

phil

What I know from discussions here in Germany is that Gramps has a reputation of being much more complicated, and consequently not as user-friendly, as other genealogy software on the market. Maybe this is what you mean by

already creaking and groaning under the sheer weight of tweaks (and add-ons)

I feel it would be quite difficult to define exactly which tweaks and add-ons should be considered important and which ones not. But this discussion would only distract from the most important issue we both keep pointing out: how to modernize Gramps so that it can fully benefit from the modern database backends we have today. This is exactly the crossroads you’re talking of.

My personal opinion is that this will decide if Gramps has a future or not – the tweaks and add-ons will not …

And to be clear: I don’t want to belittle the excellent work of the developers when talking of “tweaks and add-ons”. But without an appropriate foundation, no tweak or add-on can guarantee the survival of any software on the market. And this foundation is, in my opinion,

already creaking and groaning

Although I’m still using 5.1.7 (my hack) for a couple of reasons, I see no problem with SQLite here, not even with 600k persons from the famous Charlemagne file. No problems means that search times in the Person view are quite reasonable, and the Event view works too, although in that file the event/person ratio is not as large as yours.

I’m running Gramps on Linux Mint, which is faster than Windows on the same machine, most likely because of the file system, and maybe memory management too. But today I also installed the 5.2.2 AIO on my 10-year-old laptop, with an i5, 8 GB RAM, and an old-fashioned HDD, and even on that, my search times are reasonable, meaning that they’re less than a minute, which is something that I can accept for a database of this size.

This means that on that old hardware, my 5.2.2 with SQLite is faster than your 5.1.6 with BSDDB. How can that be?

Maybe our definitions of “reasonable” differ a bit. A search time of a minute does not fit my definition if I need to search 50 to 100 times on a normal working day: that would add up to 1 to 2 hours of waiting for search results, when my database app with PostgreSQL delivers search (and presentation) times of under one second. So the benchmark should not be what you or I see on our systems, but what is achievable with modern database backends. A search over 600k rows is simply small change for those backends, and nothing a user should be forced to wait a minute for.

But at the end of the day, I don’t have any clue. I know from other discussions that search times appear to vary across a very wide range without any recognizable pattern.

When I do a search in RootsMagic, with SQLite, it’s also way faster than Gramps, and when I do it in GeneWeb it’s even faster. But IMO, that is irrelevant for 5.3, because I don’t expect that we can move all searches to the backend in such a small update.

“Less than a minute” is a broad answer.

On complex (multi-pass) filters that will be re-applied repeatedly, I tend to create a Tag and use the Add/Remove Tag tool to make a less complex (and faster) filter.

I have long used the yyyymmdd format, as this sorts in chronological order. Very few genealogical applications allow this, Gramps being one of them and Reunion (macOS only) being another.
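The reason this works: with a fixed-width, zero-padded format, plain string sorting and chronological sorting coincide. A trivial check:

```python
dates = ["19991231", "20000101", "18500704"]
print(sorted(dates))  # ['18500704', '19991231', '20000101'] – string order is date order
```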

There are typefaces that have all of the digits aligned and which aren’t monospace, the old macOS Lucida Grande being one.

Right. “Less than a minute” may be quite OK if I do the search once or twice a day; then I will use the time to get a coffee. It’s completely unacceptable if I do the search 50 to 100 times a day. So my point is that we should not discuss performance in terms of who sees what on his/her machine with his/her database, but in terms of what’s possible with modern database backends today. And this probably means that RootsMagic and GeneWeb are the benchmarks we should set for Gramps.

1 Like

Why discuss single-view performance, when it all comes down to how Gramps uses the database backend?

There is no point in hacking some list-order code or a cache parameter (yes, it can be a temporary workaround) if you want the performance to be better in the long term.

BUT, changing from one database system to another takes some time, and either the developers doing the job just make a totally new version and let the old one “run and float around”, or they do as they have done with Gramps and change it in steps.

Gramps has now changed the main database backend. The next step is naturally to change from blobs (there is no point in using them in a relational database unless the data is extremely large binary data) to JSON, or to split the data into multiple tables (actually no need to do so, given the JSON support in SQLite, PostgreSQL, and most other relational databases today).
The next major step after the change to JSON objects stored in a JSON table would be to create some indexes and views as a minimum, and maybe also some functions for handling the most common tasks, e.g., some of the simpler searches/filters, and to change the Gramps code to utilize those features in the database backend.
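Purely for illustration, a sketch of what such an index and view could look like in SQLite (hypothetical table and column names, not the Gramps schema): an expression index on a JSON field, and a view that hides the JSON plumbing from the application code.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE person (handle TEXT PRIMARY KEY, json_data TEXT);

    -- An expression index so surname lookups don't scan every JSON value.
    CREATE INDEX idx_person_surname
        ON person (json_extract(json_data, '$.surname'));

    -- A view that lets application code query columns, not JSON paths.
    CREATE VIEW person_names AS
        SELECT handle,
               json_extract(json_data, '$.surname') AS surname,
               json_extract(json_data, '$.given')   AS given
        FROM person;
""")

# The view is queried like an ordinary table, and the query planner can
# match the WHERE clause against the expression index.
con.execute("SELECT handle, given FROM person_names WHERE surname = ?",
            ("Smith",))
```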

There is no need to create the most advanced SQL functions and stored procedures, or to split up the serialized objects, to speed up most of the main functions/features of Gramps. By keeping it simple, it will also be possible to use much, if not all, of the same code for different backends, e.g., the earlier mentioned multi-format databases that support much of the common SQL language…

One reason why it is not necessary to split up the JSON objects stored in a JSON column is the way we use attributes: with a JSON object we can store an object’s attributes dynamically, just as in a document or graph database, instead of having an attribute table and multiple “reference tables”.

So when you save a JSON object to the database, one object can have 50 attributes and the next object of the same type can have 10,000, and you will not need to change the database structure in any way, e.g., extend a table with more columns.
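To make that concrete, a small sketch (again with a made-up table): two objects of the same type with wildly different attribute counts go into the same one-column schema unchanged:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE event (handle TEXT PRIMARY KEY, json_data TEXT)")

small = {"type": "Census", "attributes": {"page": "4"}}
large = {"type": "Census",
         "attributes": {f"field_{i}": str(i) for i in range(10_000)}}

# Both rows fit the same schema; no ALTER TABLE needed for the big one.
con.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [("E0001", json.dumps(small)), ("E0002", json.dumps(large))],
)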

Two other features that could really speed things up are multi-threading and multi-CPU usage… Not linear, but the benefits can be huge for the right type of data, I/O, and data processing…
The dream situation would be a configuration like:

  • How many threads =
  • How many CPUs/Cores to use =

That way it would be possible to test out the best scenario for any hardware combination (see the sketch below).
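A minimal sketch of such a setting, assuming a hypothetical worker-pool option rather than anything that exists in Gramps today:

```python
import os
from concurrent.futures import ProcessPoolExecutor

# Hypothetical user setting, not an existing Gramps option:
# 0 means "let the machine decide".
CONFIGURED_WORKERS = 0

def process_chunk(chunk):
    """Stand-in for a CPU-heavy task, e.g. filtering a slice of people."""
    return [item for item in chunk if item % 2 == 0]

def run_parallel(data, chunks=8):
    workers = CONFIGURED_WORKERS or os.cpu_count()
    size = max(1, len(data) // chunks)
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_chunk, pieces)
        return [item for part in results for item in part]

if __name__ == "__main__":
    print(len(run_parallel(list(range(100_000)))))  # 50000
```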
But seriously… who has a billion entries in their database… and absolutely no time to spare?

My 2c: an included ‘Card View’ (the experimental extension occasionally crashes Gramps now) and input enhancements; way too many mouse clicks are currently needed to add a person with name, birth, places, etc., or to change that data. A ‘Form’ style should be the default.

On Win11, Gramps 5.2.2

Can I put in a plea for Feature Request 0010550: Refactor Gramps GEDCOM import so as to support GEDCOM extension addons.

This would greatly simplify implementing support for various different GEDCOM import features.
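For illustration of what such extension support could look like, here is a hypothetical tag-handler registry; none of these names exist in Gramps, and `_MILT` is just an example of a vendor extension tag an addon might claim:

```python
# Hypothetical sketch: a registry that lets addons claim GEDCOM tags
# the core importer does not understand, instead of dumping them in a note.
TAG_HANDLERS = {}

def register_tag(tag):
    """Decorator an addon would use to claim an extension tag."""
    def wrap(func):
        TAG_HANDLERS[tag] = func
        return func
    return wrap

@register_tag("_MILT")   # e.g. a vendor extension for military service
def handle_military(person, value):
    person.setdefault("attributes", {})["military"] = value

def import_line(person, tag, value):
    handler = TAG_HANDLERS.get(tag)
    if handler:
        handler(person, value)        # addon-supplied behaviour
    else:
        person.setdefault("unhandled", []).append((tag, value))

person = {}
import_line(person, "_MILT", "US Army 1942-1945")
print(person)  # {'attributes': {'military': 'US Army 1942-1945'}}
```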

See also: Re: [Gramps-devel] Gedcom 7, any progress? | Gramps
Nick wrote:

On 05/10/2022 14:06, <hidden> wrote:
I would really love it if the first step could be to modify Gramps Import so that it accepts plug-ins.

I have used the plug-in architecture for Gramps Export to implement an export for The GedView mobile app (which is excellent and includes all my media evidence of BMD and Census data), and have also done a small part of export to Heredis. But of course there is no plug-in architecture for Imports. Once that is done, maybe an import plug-in for Gedcom 7 could be done.

Yes. I totally agree.

A new Gedcom 7.0 import would probably want to do something slightly different from the current import whilst still sharing most of the code.

Nick.

3 Likes

When GEDCOM7 was brought up in MantisBT, @prculley pointed out that a number of Gramps data model changes were going to be needed.

If GEDCOM7 (uncompressed, and ZIPped with Media) can be released as Importer and Exporter add-on plug-ins, then maybe those core data model changes should be the focus of the roadmap, with the GEDCOM7 plugins in the 5.3 addon queue?

It would be really nice to have unsupported GEDCOM tags in a structured custom attribute in addition to the Note. (Because the note has an error message that complicates parsing it as a SuperTool post-process.)

I have a lot of segment triangulation data for many of my DNA matches. I have created a parallel Gramps “Triangulation” database in which I use the DNA Association functionality to represent just the triangulated segments.

However, it appears that only two-person DNA Associations are currently possible. There are many triangulated segments shared by three or more individuals. I’d therefore like 5.3 to allow associations of more than two people to be created.

Also, having a separate database is a pain to keep synchronized as to People, Families, etc. I’m not sure there is currently any way to sync the two databases such that the “regular” DNA Associations in the main database won’t interfere with my separate Triangulation segment Associations. I’d like 5.3 to allow all types of Associations to reside in the same database.
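As a data-model illustration only (nothing like this exists in Gramps), a triangulated segment shared by several people could be one record referencing any number of person handles, rather than a pairwise link:

```python
from dataclasses import dataclass, field

@dataclass
class TriangulatedSegment:
    """Hypothetical n-person association for one triangulated segment."""
    chromosome: str
    start: int
    end: int
    person_handles: list[str] = field(default_factory=list)  # two or more people

seg = TriangulatedSegment("7", 15_000_000, 32_500_000,
                          ["H_me", "H_match1", "H_match2"])
print(len(seg.person_handles))  # 3 – more than a pairwise association allows
```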

2 Likes

One more interesting feature for future Gramps versions concerns Timelines. I searched the Gramps Discourse for something similar and was pleasantly surprised that I am not the only one interested in this tool, and that it is already under development:

It would be great to have it in 5.3 if possible.

And I would also like to say thanks to all the developers who make the Gramps application better. You do an amazing and very important job!

1 Like

A big up-vote for grouping of sources.

I imagine this as more of a view thing than a model/database thing, although it probably needs some storage for optimisation. So grouping keys would be kept separate from the current source record.

E.g. a tree view that groups a number of sources with a shared initial part.
It is most useful when there are many sources with a shared initial part that is long enough to be meaningful.

Good if this grouping can be auto-detected, but still useful if it is manual.
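Auto-detection could be as simple as keying on a shared initial substring. A rough sketch, where the length that counts as “meaningful” is just a guess:

```python
from itertools import groupby

MIN_PREFIX = 12  # guess at what counts as a "meaningful" shared start

def prefix_key(title, length=MIN_PREFIX):
    return title[:length]

def group_sources(titles):
    """Group source titles that share a long-enough initial part."""
    groups = {}
    for key, items in groupby(sorted(titles), key=prefix_key):
        items = list(items)
        # Only group when several sources actually share the prefix;
        # singletons keep their full title as the key.
        groups.setdefault(key if len(items) > 1 else items[0], []).extend(items)
    return groups

titles = [
    "England Census 1841, Kent",
    "England Census 1851, Kent",
    "Parish register, St Mary",
]
for key, members in group_sources(titles).items():
    print(key, "->", members)
```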