Performance Issues with GRAMPS

So Nick, my input:
Statistics
Number of Individuals 15365
Number of Families 4512
Number of Unique Surnames 2581
Individuals with media objects 6641
Number of media objects 12686

GRAMPS: 5.1.6
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [G…
sqlite: 3.37.2 (2.6.0)
LANG: en_GB.UTF-8
OS: Linux Mate on Ubuntu 22
Distribution: 6.5.0-28-generic

Only one issue: Narrative Web, for well-known reasons. Other than that, no requirement to improve performance.
phil

are you talking about:

Yes, that is the one. I haven't got any figures for doing that, but will report back.

The performance issues that aggravated me have been:

  1. the initial use of Gramps on a fresh install on a new machine takes a LONG time to make all the __pycache__ folders and register built-in plugins with NO feedback.
    The time for the process is not excessive, but the appearance of nothing happening can lead to killing the process or double-launches. Can Gramps have an initialization dialog that reports stage progress? (A rough sketch of that kind of feedback follows this list.)

  2. Similar to #1, the Importer plugins need a Status dialog (or Statusbar) progress feedback. This issue is unlikely to affect power users … they are less likely to do large imports than new users. (Note that the niceties of a post-import status dialog and automatic tagging and/or sourcing of imported data are not available for ALL importer plug-ins.)

  3. On Windows with a HDD (not an SSD), populating a large, complex Place hierarchy into the Place selector had unpredictable performance. It seemed to get slower as sessions grew long. Hacking from a Hierarchical to a Flat tree Object Selector eliminated the problem. I’ve had no problems (even with the same version of Gramps) since switching to a Fedora box with an SSD.
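
Here is a minimal sketch (not Gramps code) of the feedback asked for in items 1 and 2: run the long-running work in a background thread and report stage progress to a Gtk.ProgressBar via GLib.idle_add. The stage names and callables are made up for illustration.

```python
# Sketch only: show stage-by-stage progress for long-running startup work.
import threading

import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk, GLib

def run_with_progress(stages):
    """stages: list of (label, callable) pairs executed off the UI thread."""
    win = Gtk.Window(title="Initializing Gramps")
    bar = Gtk.ProgressBar(show_text=True)
    win.add(bar)
    win.show_all()

    def worker():
        for i, (label, task) in enumerate(stages, start=1):
            GLib.idle_add(bar.set_text, label)   # GTK calls only via idle_add
            task()                               # the actual work, off the UI thread
            GLib.idle_add(bar.set_fraction, i / len(stages))
        GLib.idle_add(Gtk.main_quit)

    threading.Thread(target=worker, daemon=True).start()
    Gtk.main()

# Example with dummy stages:
# run_with_progress([("Compiling plugins", lambda: None),
#                    ("Registering plugins", lambda: None)])
```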

Every time I add a citation, when the window opens, it loads all the sources/citations, which for me is getting to be 2 seconds before I can do anything. That equates to more than 400 seconds a day that I have lost in productivity. There must be a better way to do this.
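
A minimal sketch, assuming the selector is backed by a Gtk.ListStore, of one possible "better way": populate the model in small chunks from the GTK idle loop, so the dialog appears immediately instead of blocking while every source/citation row is loaded. `fetch_citation_rows()` is a hypothetical data-access call, not a real Gramps function.

```python
import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk, GLib

def populate_lazily(list_store, rows, chunk_size=200):
    """Append rows to the model a chunk at a time without freezing the UI."""
    iterator = iter(rows)

    def add_chunk():
        for _ in range(chunk_size):
            try:
                list_store.append(next(iterator))
            except StopIteration:
                return False    # all rows added; remove the idle handler
        return True             # more rows remain; run again when idle

    GLib.idle_add(add_chunk)

# populate_lazily(store, fetch_citation_rows())
```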


Here are a few things that I know:

  1. GEDCOM import is slow. You mentioned that earlier, and I feel kind of lucky that I imported my Charlemagne GEDCOM years ago. It takes a few minutes in PAF and RootsMagic, hours in Gramps.
  2. Check & Repair is slow on large databases too, so I avoid that for those. And on such databases, finding duplicate persons is so slow, that it is way faster to export a GEDCOM to RootsMagic, and start a duplicate search in that. And that includes the time to do the import in RM.
  3. The performance of the Deep Connections Gramplet is the worst of all, and unlike the GEDCOM import, Check & Repair, and the Duplicate Person Search, there is no progress bar, meaning that it looks like Gramps is dead.

My general impression is that writing performance on Linux is much better than on Windows, which explains the Windoze slur quite well. This is visible during the creation of the cache folders (one time) and during GEDCOM import (every time). I see no significant difference in reading speed.


I once tried some changes to the people category view to include a column for the relationship to the home person. The view came up quickly enough but was very slow when filtering, and hung when sorting. Maybe that’s to be expected since it’s not a trivial calculation, or maybe I could have coded it differently.

Otherwise, for my database (which I feel is probably on the smaller side at fewer than 6,000 people) the only performance issues I see have more to do with the human time required for me to navigate through all of the various selectors when doing data entry, and I think there are already outstanding feature requests about those.

I was going to suggest that people include in their posts a copy & paste from the Statistics gramplet, but I notice that it counts only people, families, and media objects and not events, sources, citations, or places.


There is an Object Selector GEPS 041.

It suggests a couple of alternatives with lower overhead to populate, such as a “last 10 Citations/Sources” list (reasonable, considering that you’re likely to use the same source and page when you’re doing data entry in a session) and a persistent recollection of the Object Selector subset that you prefer.
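
A rough sketch of the “last 10 Citations/Sources” idea in plain Python; the handle/object names are hypothetical rather than the real Gramps API.

```python
from collections import OrderedDict

class RecentObjects:
    """Remember the most recently used objects, newest first."""

    def __init__(self, max_items=10):
        self.max_items = max_items
        self._items = OrderedDict()          # handle -> object, oldest first

    def remember(self, handle, obj):
        self._items.pop(handle, None)        # re-inserting moves it to the end
        self._items[handle] = obj
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict the oldest entry

    def recent(self):
        return list(reversed(self._items.values()))

# recent_citations = RecentObjects()
# recent_citations.remember(citation_handle, citation)   # after each edit
# selector.show(recent_citations.recent())               # cheap to populate
```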

Oh, YES !!!

Deep Connections running in the sidebar or background absolutely kills Gramps performance and makes it crash a lot too.

And the various Gramplets that do a Database-wide analysis in the Dashboard are also killers. But they are so much less useful than Deep Connections that the overhead goes unnoticed. After all, learning which Surnames and Given names are most common in the Tree is pretty useless. Learning that for a particular Pedigree or Line of Descent would be more interesting… but the Gramplets cannot DO that.

@cdhorn built some Statistics Dashboard CardView test balloons:

But he pulled back on them due to the performance hit. The interactivity was wonderful for troubleshooting outliers, e.g. perusing the “Unique surnames” often turned up typos.

Just the number of people in the database would be helpful to start with.

Win10 with 5.1.6
6600 persons and 18,000 source/citations
Only a couple of hundred media.


Adding a place enclosed by a place with many enclosures (in my case Wiltshire). The time for the process is not an issue because adding such a place is infrequent. But the appearance of nothing happening can lead to killing the process.

16458 people
7176 places, about 400 in Wiltshire

GRAMPS: AIO64-5.2.1-r1-cdec38e
Python: 3.11.8
BSDDB: 6.2.9 (6, 0, 30)
sqlite: 3.45.2 (2.6.0)
LANG: en_GB.UTF-8
OS: Windows

Not performance, but time spent on data entry; hopefully updates to the Form Addon might help. I import very little (my choice largely), so most things are transcribed.
phil

Yeah, I also count on it a lot. It’s the place where the most time is spent.

I did all the following speed tests on a Win10/64 machine. The BSDDB database is on an SSD drive, as is the SQLite database for RootsMagic.

[image: table of speed-test results]

0 in the table means that the search result was displayed (nearly) instantaneously so clocking the time was not possible (and would be quite irrelevant).

Another observation is that the place selector (7284 places) is populated nearly instantaneously with 5.1.6 on BSDDB, but needs 1 - 2 sec with 5.2.2 on PostgreSQL. This observation matches the speed tests quite well. I have never noticed any significant performance difference between PostgreSQL and SQLite, but I have no measured data for SQLite.

The situation where the speed problem of 5.2.2 is most annoying is clearly the search for individuals. 5.1.6 on BSDDB is not really good (so I developed a solution outside of Gramps) but acceptable; the factor of 4 for the time needed with 5.2.2 on PostgreSQL is simply unacceptable. I guess a quick work-around would be to make BSDDB available with 5.2.2, under the assumption that 5.2.2 would then be on the same performance level as 5.1.6.

And my personal opinion on the 13 sec with RootsMagic and the 640k dataset is that this isn’t really good either. A dataset of that size should be small change for SQLite if the indexes are well done, so I guess that the problem is located somewhere in the program code of RM.
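
A toy check of that claim with SQLite (the schema is invented, not RootsMagic's): once the surname column is indexed, an exact-match lookup is a B-tree search rather than a scan of the whole table, regardless of whether the table has 6k or 640k rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT, given TEXT)")
con.execute("CREATE INDEX idx_person_surname ON person (surname)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM person WHERE surname = ?", ("Smith",)
).fetchone()
print(plan)   # the plan reports a SEARCH using idx_person_surname, not a SCAN
```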


Very interesting how different users have completely different work styles. I’ve never touched the Form Addon over all the years I’ve been using Gramps now …

It’s true ))
I haven’t used Forms yet either. But I notice how much time I spend on routine mechanical actions that could be automated.

What is that search exactly? When I type ‘enno’ in the person filter here, my Gramps 5.1 needs 20 seconds to show all persons with that substring in their name. A similar filter in RM 9 needs less than a second, but only shows persons with that string in their surname. Typing ‘,enno’ shows similar results for given names, in a flash.

Finding substrings like this probably means that both programs need to do a full table scan, but some of the data may already be in cache, because both programs are designed like that, so that the person views are as fast as possible.
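
A small SQLite illustration of that point (hypothetical table): an exact surname match can use an ordinary B-tree index, while a substring pattern with a leading wildcard forces a full scan, which is why both programs end up reading the whole person table for this kind of search.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (surname TEXT, given TEXT)")
con.execute("CREATE INDEX idx_surname ON person (surname)")

for query in (
    "SELECT * FROM person WHERE surname = 'Smith'",       # SEARCH via the index
    "SELECT * FROM person WHERE surname LIKE '%enno%'",   # full SCAN of the table
):
    print(query)
    print(con.execute("EXPLAIN QUERY PLAN " + query).fetchall())
```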

I’m running Gramps 5.1.6 and RootsMagic 9 both on Linux here, Mint 21.3, on an HP Envy with an AMD Ryzen 7, 16 GB RAM, and an Intel SSD.

Exactly that. Simply a full surname. Are you using the 650k dataset or another one?