Slow filter searches in the People and Citations views

GRAMPS: 5.2.1
Python: 3.10.12
BSDDB: 6.2.9 (5, 3, 28)
sqlite: 3.37.2 (2.6.0)
LANG: en_US.UTF-8
OS: Linux
Distribution: 6.5.0-44-generic

I tried to find any posts about “slow search” but found none. It looks like nothing similar has been discussed yet, or maybe it was under another name. I have a small DB with only 11,000 people, which is nothing for modern computing power. I think my PC is modern, or at least fast enough )). But each search takes almost 2 seconds (1.76 sec). I think that is too much, since search is used very often throughout the day. What about users with more people or poorer hardware?
I see similar search times on the Citations page as well.



I’m sorry if I duplicated an earlier post.

Have you tried the hack @Nick-Hall suggested that increases the size of the SQLite cache? Nick has been asking for feedback and there hasn’t been much.
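For context, the general mechanism behind such a tweak is SQLite’s cache_size pragma. Here is a minimal plain-sqlite3 sketch of the idea, completely outside Gramps; the file path, cache value, and `person` table name are my own assumptions, not the exact hack from that topic:

```python
import sqlite3

# Sketch only: open a Gramps SQLite file directly (the path below is a placeholder)
# and enlarge the page cache for this connection.
con = sqlite3.connect("/path/to/grampsdb/<tree-id>/sqlite.db")

# A negative cache_size is interpreted by SQLite as a size in KiB,
# so -200000 asks for roughly 200 MB of page cache (the default is far smaller).
con.execute("PRAGMA cache_size = -200000")

# Queries on this connection can now keep many more pages in memory.
# The "person" table name is an assumption about the Gramps schema.
count = con.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(count)
con.close()
```

The only important line is the PRAGMA; how that value gets applied inside Gramps is what the linked topic discusses.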


I will try and answer there. Thank you!


That topic is already closed.

I tested it on Ubuntu 22.04. The search time decreased from 1.76 sec to 1.3 sec.

(Screenshots from 2024-08-11 showing the search timings: 21-22-39, 21-27-51, 21-28-42.)

To make giving feedback easier, you might want to swap out the built-in Filter gramplets for the Isotammi Filter+ addon gramplets.

They have a display option that shows the timings.

(Swapping gramplets in all the various categories is tedious. @kku wrote a SuperTool script for adding the Isotammi collection to the Addon Manager project list. A script to put Filter+ anywhere a Filter gramplet exists would be even more useful.)

How can I do this? I don’t use the “Filter+” tab, I use the “Filter” tab. Or maybe I need to remove the “Filter+” addon?

The “Close” gadget for Gramplet tabs is disabled by default in Preferences. After you install the Filter+ gramplet with the Addon Manager, enable the “Show close button in gramplet bar tabs” preference on the General tab. Then close the built-in Filter gramplet, which is the first tab in the sidebar of each category. Add the Filter+ gramplet from the Gramplet bar menu. Finally, drag the Filter+ tab to be the first tab in the sidebar’s title bar.


Got it, thank you. I did it.
Now I see 1.32 sec, which is almost the same as the previous measurement.


Another measurement with a smaller DB:


(Screenshot from 2024-08-11 21-53-36 showing the timing for the smaller database.)

Looks like each 5k people adds 0.5–0.7 sec to the search time )))

One thing to be aware of: multiple terms in the Name field cause a HUGE performance hit.

That is because the filter searches each term separately. Plus, there are SO many name fields to search.

Serge @SNoiraud has shared some tips in the past to make it more efficient when you filter. It is one of the reasons that he recommends using Regular Expressions in Filters.
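To illustrate the idea only (plain Python, not Gramps filter code): one compiled regular expression can cover several name variants in a single pass, whereas separate terms each mean another scan over all the name fields. The names and pattern below are invented for the example:

```python
import re

# Hypothetical name list standing in for the many name fields Gramps scans.
names = ["Johann Sebastian Bach", "John Smith", "Jon Snow", "Ivan Petrov"]

# Separate terms: each one is a full pass over every name.
terms = ["john", "jon", "johann"]
hits_terms = [n for n in names if any(t in n.lower() for t in terms)]

# One regular expression: a single pass that covers all the variants at once.
pattern = re.compile(r"\bjoh?a?nn?\b", re.IGNORECASE)
hits_regex = [n for n in names if pattern.search(n)]

print(hits_terms)
print(hits_regex)
```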

Which is about the time it takes for a sip of coffee (or tea, or beer, or wine). Relax!

I might be wrong, but such a significant search time can only indicate one thing: the search is not performed by the database, but by loops in Python code. And with 11,000 people, 11,000 queries are being made to the database. I don’t know how exactly this database works, but in MySQL, for example, such a search would be done with a single query in just a few microseconds. I assume that currently, the Gramps core does not have an API for such a search in one query, but it does have an API to get the names of each individual separately. Is this how it works now?
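Purely for illustration (the table and column names below are invented, and this is not how Gramps actually stores people), the difference being described is roughly between driving one lookup per person from a Python loop and letting the database filter everything in a single statement:

```python
import sqlite3

# Toy schema standing in for a people table; all names here are assumptions.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (handle TEXT PRIMARY KEY, surname TEXT)")
con.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(f"H{i}", f"Surname{i % 100}") for i in range(11_000)],
)

# Pattern 1: one query per person, driven by a Python loop
# (the "11,000 queries" scenario described above).
handles = [row[0] for row in con.execute("SELECT handle FROM person")]
matches_loop = []
for h in handles:
    surname = con.execute(
        "SELECT surname FROM person WHERE handle = ?", (h,)
    ).fetchone()[0]
    if "Surname42" in surname:
        matches_loop.append(h)

# Pattern 2: a single statement, so the database does the filtering itself.
matches_sql = [
    row[0]
    for row in con.execute(
        "SELECT handle FROM person WHERE surname LIKE ?", ("%Surname42%",)
    )
]

print(len(matches_loop), len(matches_sql))  # both find the same 110 people
con.close()
```

Whether the second pattern is available depends on how much of a query Gramps can push down to its backend; the sketch only shows where the time difference would come from.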

I can’t take 300 sips of beer or something else throughout the day just because the program works slowly. I agree that I can wait 1.32 seconds. But I raised this issue because my database is growing rapidly. And if I see a slowdown in search today, in 2-3 years, I’ll have to go for a snack or a walk instead of just taking a sip of beer. If this is really what awaits me in the future, then all the flexibility of the Gramps application will be of no use to me. I’ll be forced to look for another application and spend a lot of time transferring my database there and adapting to another functionality. Right now, I have a strong feeling that Gramps is an extremely flexible application but only for small databases.

Keep in mind that the pace of data collection is increasing, as different researchers are combining their research into one large tree. We are also on the verge of a new stage of technological revolution. In a few years, all our documents will be read and analyzed by artificial intelligence, which will be able to independently build family connections. This means that our databases will quickly be filled with family trees of entire settlements. If this happens, Gramps will not be able to compete due to low performance. This is exactly what I am afraid of. And when I think that I will have to look for some other program, I feel like crying because Gramps is the best software product for me at the moment. I ask all of you not to overshadow all its advantages with a single performance drawback. And yes, 1.32 seconds for just 11,000 people is indeed a lot of time for the modern world.

Gramps is a unique application. It has the flexibility that I love and that made me choose it as a user. All of this has been made possible only by the developers who have invested a lot of their time, for which I am very grateful, as not everyone is willing to make such sacrifices, including myself.

Whilst I am mostly in agreement with the points you raise,

I disagree on two areas:

  1. Merging GEDCOMs (no matter how impeccable the source) is just merging
    their errors into your errors and then spending time finding them.
  2. AI transcribing of documents: if you think human transcribers can make faux pas, wait till you see the high-speed version.

What is being described is no longer Family History; it is Data Harvesting.

Family History requires time, patience, research, and evidence. Data Farming requires none of the above, merely acceptance of large-scale data junk.

phil


Yep, you’re wrong about the assumptions.

The gramps database model is not what you described.

The filter gramplet does not run a single query. It is a stack of queries against different tables of data, where the Name, Place, and Date queries are the most complex.

And since all the tables of the entire database aren’t held in memory at the same time, the expectation of any search taking microseconds is unreasonable.
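As a purely illustrative sketch (toy data, not the actual Gramps filter code), the “stack of queries” idea looks roughly like this: each rule makes its own pass over the data and returns the handles it matched, and the filter then combines those sets, so the total cost is several passes rather than one query:

```python
# Toy data standing in for database tables; handles map to person records.
people = {
    "H1": {"name": "John Smith", "birth_year": 1850},
    "H2": {"name": "Jane Smith", "birth_year": 1950},
    "H3": {"name": "Ivan Petrov", "birth_year": 1880},
}

# Each "rule" is one pass over the data and returns the handles it matches.
def name_rule(db):
    return {h for h, p in db.items() if "Smith" in p["name"]}

def date_rule(db):
    return {h for h, p in db.items() if p["birth_year"] < 1900}

def run_filter(db, rules):
    # Intersect the result sets of all rules (an "all rules must match" filter).
    result = None
    for rule in rules:
        matched = rule(db)
        result = matched if result is None else result & matched
    return result

print(run_filter(people, [name_rule, date_rule]))  # {'H1'}
```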

Got it. I’ve surfed the internet a bit. All the top popular Gramps analogues use SQLite, which doesn’t have the problems above. Of course, my words don’t carry much weight here, but I think a non-relational DB is not the best solution for such a complex and multifunctional application as Gramps. It doesn’t have a future with BDB. Remember Windows and disk sizes 10 years ago compared with today. Each of us already has terabytes of data. Something similar is happening in genealogy, but you guys refuse to believe that Gramps users can have more than one family in their tree. Genealogists will probably want to research whole towns (all the people in a town). They want to merge data with other genealogists. The quality of the merged data is a separate question; users will merge data anyway. And artificial intelligence even now generates more than just data junk (and this is only the beginning). A few more years and artificial intelligence will do our research no worse than we do it ourselves.

Gramps has used SQLite since version 5.0. It became the recommended DB engine with the 5.1 release. If your trees are still in BSDDB, conversion is recommended.

Mhm, so it looks like I use SQLite (I see it in the grampsdb directory).
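If anyone else wants to double-check which backend a tree uses, my understanding (an assumption worth verifying) is that each tree folder under grampsdb contains a small database.txt file naming the backend, next to name.txt with the tree name. A quick sketch:

```python
from pathlib import Path

# Assumptions: the default Linux location and the database.txt / name.txt file
# names; both may differ on other platforms or with a custom database path.
grampsdb = Path.home() / ".gramps" / "grampsdb"
for tree in sorted(p for p in grampsdb.iterdir() if p.is_dir()):
    name_file = tree / "name.txt"
    backend_file = tree / "database.txt"
    if name_file.exists() and backend_file.exists():
        print(name_file.read_text().strip(), "->", backend_file.read_text().strip())
```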

Our default backend is now SQLite, but our database plugin framework allows you to use any database of your choosing.

We dropped support for BSDDB in v5.2.

The future will probably involve a move to the newer generation of hybrid databases. Genealogy data could clearly benefit from a graph database, but I also see aspects where document or relational storage could be useful.

Gramps has an object based design and this is unlikely to change. Our filters allow complex queries to be written, but they are not the most efficient for simple queries.

If it is already SQLite, then I don’t understand why it is slow. My DB is only 36 MB. On one of my projects I currently work with a 3 GB database. Some tables have 1 million+ records, I can use JOIN and so on, and it is still fast. But 36 MB is really nothing.
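For comparison, here is a throwaway plain-SQLite sketch of that kind of workload (invented tables, nothing to do with the Gramps schema): with indexes in place, a JOIN that touches million-row tables still answers quickly, which is the point being made above.

```python
import sqlite3
import time

# Build two invented million-row tables and index the columns used in the query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT)")
con.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, person_id INTEGER, year INTEGER)")
con.executemany("INSERT INTO person VALUES (?, ?)",
                ((i, f"Surname{i % 1000}") for i in range(1_000_000)))
con.executemany("INSERT INTO event VALUES (?, ?, ?)",
                ((i, i, 1800 + i % 200) for i in range(1_000_000)))
con.execute("CREATE INDEX idx_event_person ON event(person_id)")
con.execute("CREATE INDEX idx_person_surname ON person(surname)")

# Time an indexed JOIN; the setup above is slow, but the query itself is not.
start = time.perf_counter()
rows = con.execute(
    """SELECT p.surname, e.year
       FROM person AS p JOIN event AS e ON e.person_id = p.id
       WHERE p.surname = ?""",
    ("Surname42",),
).fetchall()
print(len(rows), f"{time.perf_counter() - start:.4f} sec")
con.close()
```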