Processor Affinity for Gramplets on multi-core?

Would it be possible to give Gramplets a Processor Affinity preference?

I understand that multi-threading to effectively use more cores to speed up Gramps is problematic because of sequential scheduling dependencies. But aren’t Gramplets mostly parallel processes that are of secondary importance? If their CPU load were removed from the CPU running the Interface & primary Gramps functions, wouldn’t the slowdowns due to Gramplets be minimized? In effect, it would improve the responsiveness of Gramps.
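To make the idea concrete, here is a minimal sketch (not anything Gramps does today) of running a Gramplet-style job in a separate worker process and pinning it with psutil; the core numbers and the process split itself are only assumptions:

```python
# Sketch only: Gramps does not currently run Gramplets in worker processes.
import multiprocessing
import psutil

def background_gramplet_job():
    # Pin this worker to cores 2-3, assuming cores 0-1 serve the GUI.
    psutil.Process().cpu_affinity([2, 3])
    # ... a long, read-only Gramplet computation would go here ...

if __name__ == "__main__":
    worker = multiprocessing.Process(target=background_gramplet_job)
    worker.start()   # the GUI process keeps handling keystrokes & clicks
```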

No doubt there would be some growing pains: perhaps we’d learn that data import & data entry Gramplets couldn’t be multi-threaded but Reports & Quick Views could. (Which might require some re-categorization of Gramplets. Deep Connections is more of a Report than a Gramplet, isn’t it?)

Wouldn’t concurrent access to the database be a problem?

Does that need to be a consideration if the process opens the database in read-only mode? That might be a way to distinguish between Gramplets that can be multi-threaded & those that cannot.
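For the SQLite backend, read-only really is just a connection flag; a small sketch (the file path and the person table name are placeholders for whatever the tree actually uses):

```python
import sqlite3

db_path = "/path/to/grampsdb/xxxx/sqlite.db"   # placeholder path
# mode=ro opens the file read-only: any attempted write raises
# sqlite3.OperationalError instead of interfering with the owning process.
con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
print(con.execute("SELECT COUNT(*) FROM person").fetchone()[0])
```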

Gramps was never planned for any kind of multi-user or multi-process access. So unless the whole thing was opened read-only, we could get failures if one process writes anything while another process reads. I don’t think making Gramps faster and read-only is worth much.

Doing this right would take a lot of work; at a minimum I think we would have to lock the db at the transaction level, and do a lot more to make sure that views/gramplets were updated somehow in a coherent fashion. Gramps has a lot of structures that hold db data that would fail if they got out of sync; the current system of signals and callbacks tries to make sure these stay in sync (and sometimes fails, generating crashes). I shudder to think what it would take for totally asynchronous changes to avoid problems.
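For what “lock the db at the transaction level” could look like in practice with the SQLite backend, here is a rough sketch; the table and column names are assumptions, and this is not how Gramps currently manages its transactions:

```python
import sqlite3

# isolation_level=None lets us manage the transaction explicitly;
# timeout=5.0 makes a second writer wait up to 5 s for the lock.
con = sqlite3.connect("example.db", isolation_level=None, timeout=5.0)

def locked_update(handle, blob):
    # BEGIN IMMEDIATE takes the write lock up front, so a second writer
    # blocks (or times out) instead of colliding mid-transaction.
    con.execute("BEGIN IMMEDIATE")
    try:
        con.execute("UPDATE person SET blob_data = ? WHERE handle = ?",
                    (blob, handle))
        con.execute("COMMIT")
    except Exception:
        con.execute("ROLLBACK")
        raise
```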

I don’t see that happening soon, unless a programmer truly experienced with these issues can come up with a fundamental plan for how this might be done.


SQLite does have built-in support for multithreading, as do all the “big” database servers…

For Python, both the DB-API 2 interface and Psycopg2 also support multithreaded connections…
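As an illustration, the usual pattern those drivers allow is one connection per thread; a tiny sketch with the standard sqlite3 module (the table name is just a placeholder):

```python
import sqlite3
import threading

print(sqlite3.threadsafety)   # how thread-safe this build of the module is

def count_rows(results, idx):
    # One connection per thread is the pattern DB-API 2 drivers support;
    # sharing a single connection needs check_same_thread=False plus locking.
    con = sqlite3.connect("example.db")
    results[idx] = con.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    con.close()

results = [None, None]
threads = [threading.Thread(target=count_rows, args=(results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```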

But that’s not the same as it being an easy job to change Gramps to benefit from it…

And another question would be: what processes would benefit from true multithreading in Gramps?

First off, it would hardly help at all if you have your data on a hard drive; it’s not certain even a SATA SSD would give enough read/write speed to benefit from multithreaded database connections…
… and then there’s Gramps itself: what processes would benefit from MT? I know drawing large graphs and diagrams would gain from it, if the libraries used utilize MT (most Python 3 libraries do, I think)…
Maybe search, but then again, that would depend on the database, the database connection, and the read/write speed of the hardware.

But will things like adding information, writing text, adding media, or adding map points benefit from it at all?

What do people use that would actually benefit from full MT in Gramps?
Which Gramplets in use would benefit from this?
These are the questions we need to answer before we ask developers to make that kind of change…

I think that for the graphical parts, there would be some benefit from both MT database connections and MT processing in Gramps (I have never even checked whether Gramps’s Python scripts use multiple threads when executing)…
I use Gramps on Windows, so the biggest issue I have is the initial startup time (most likely the time it takes to read the zip files)…

I’m in the process of getting a new workstation now, where Windows will be installed on a PCIe 4 NVMe SSD and all databases (PostgreSQL, MongoDB, and SQLite) will be on a 4x NVMe SSD 4 TB RAID 0. When that system is up and running in a few months, I will write a little report about the time differences when using my nearly 800k-place database, both in SQLite and MongoDB…
The system will be built on an AMD Ryzen 7 or 9 with a minimum of 64 GB RAM (most likely 128 GB)… The system I have today is an Intel i7 4820K with 32 GB RAM, with the OS and databases on separate SATA SSDs over SATA 3 connections, so this upgrade will be the kind of “normal” upgrade many people would make…
A 6-10 year old computer updated to a new system, even though most people will maybe go for a Ryzen 5 with 6 cores…

I will also build an EPYC workstation, but that will not happen until there are some full-featured PCIe 4 motherboards out there (at the moment there is only one, and it’s nearly impossible to get here in Norway)… but I will of course test the speeds on that too when it’s up and running… (I don’t think it will show any speed improvement over a “normal” PC built on the latest Ryzen or Intel architecture)…


I was thinking of the following candidates for pushing to a different core or thread:

  1. Deep Connections Gramplet
  2. Pedigree Gramplet
  3. Media Preview Gramplet
  4. Fan Chart Gramplet
  5. Quick View Gramplet
  6. Gallery Gramplets

They all add a heavy spike of CPU load to Gramps as you change the Active Person, and they lag behind navigation. None allow data editing. And they could all be more interruptible when the Active Person focus changes.

I’d MUCH rather see the primary core be more responsive to keystrokes & clicks in the interface than be diverted by these wonderful but burdensome Gramplets.

When I was running beta testing cycles, we always had users & programmers do rotations on the slowest machines available. It gives a person a different valuation of what is a worthwhile optimization. And enduring that pain helped reduce the pain for EVERY user.


Taking Deep Connections as an example: it does an enormous amount of db access behind the scenes, basically doing a tree search of your db to see if it can get the two endpoints to meet. The db, in turn, does a lot of disk access to get the requested data. I’m not sure how much this would benefit from more processor power (it is really pretty much disk-bound unless the OS caching of file data can help out here).
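Roughly, the kind of search involved looks like the sketch below; get_linked_handles() is a hypothetical stand-in for the many per-person db reads (parents, spouses, children) the real Gramplet performs:

```python
from collections import deque

def find_connection(db, start_handle, target_handle, get_linked_handles):
    """Breadth-first search from one person toward another.

    get_linked_handles(db, handle) is a hypothetical helper returning the
    handles of related people; every call is another round of db reads."""
    came_from = {start_handle: None}
    queue = deque([start_handle])
    while queue:
        handle = queue.popleft()
        if handle == target_handle:          # endpoints met: rebuild the path
            path = []
            while handle is not None:
                path.append(handle)
                handle = came_from[handle]
            return list(reversed(path))
        for nxt in get_linked_handles(db, handle):
            if nxt not in came_from:
                came_from[nxt] = handle
                queue.append(nxt)
    return None                              # no connection found
```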

It is already coded to allow a certain amount of interruption (all Gramplets are run in ‘idle’ time at low priority, after events from the user have already been processed), so it should not impose too much delay on user events. This type of processing explains why Gramplets may lag in their display when the active person changes.

If you think that having this Gramplet active is slowing down your changing of the active person, it may be that it doesn’t ‘yield’ the CPU often enough when large trees are involved, which could be fixed without multiprocessing.
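The general shape of that idle-time pattern, sketched with GLib.idle_add (this is just the pattern, not the actual Gramplet base-class code; yielding after smaller chunks of work is the kind of fix being suggested):

```python
from gi.repository import GLib

def long_gramplet_job():
    # Stand-in for a Deep Connections style search: do a chunk of work,
    # then hand control back to the GUI main loop.
    for i in range(1_000_000):
        _ = i * i                  # placeholder for one db lookup
        if i % 500 == 0:
            yield True             # smaller chunks here = snappier interface
    yield False                    # finished: tell GLib to stop calling us

job = long_gramplet_job()
# GLib calls this only when no user events are pending, and keeps calling
# it for as long as it returns True.
GLib.idle_add(lambda: next(job, False), priority=GLib.PRIORITY_LOW)
```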

The same may be true for some other Gramplets…


Better yielding by the Gramplets is the more interesting point.

The lag in any Gramplet isn’t as important as the slowness of the Views when those Gramplets are stealing cycles. There are times when I have to click 2 or 3 times to activate a field in the view or to bring up the next edit dialog.

Populating the Select Place dialog is so agonizingly slow (with 2,500 places) that I use the clipboard to avoid opening it. But it doesn’t have a consistent delay in populating in successive uses. Sometimes the delay is only a second but it can stretch to 20 seconds.

As I switch back & forth between the People & Relationship category views, occasionally the refresh snaps back to the other category view. (Oddly, double-clicking for switching category views is both faster & more reliable than single-clicking.)

If I click into another part of the View screen space too quickly after clicking OK in an Edit dialog, it pops up an unexpected error. But it just hasn’t finished housekeeping.

If I work slowly & deliberately, the interface is ok. But the faster I try to move between windows, the more problems it exhibits.

I realize that I should just close the Gramplets… but they’re too darn helpful!

I have a “test” database with a little over 4,500 people, with relations going back to approximately the year 900 (some Norwegian kings)…

When I use Deep Connections with myself as Home Person (born 1967) and select that person, it takes approx. 20 seconds before the list is populated… my 8 threads are utilized at 14-20%…

Gramps only uses 139 MB of memory, even though I have 32 GB installed and 16 GB free…

And Gramps doesn’t even touch my SSDs while running Deep Connections… that gets me wondering if my whole SQLite DB is loaded into memory.

When running the Pedigree Gramplet on that same database and selecting the same people (all approximately 20 generations back)… the Pedigree updates instantly; there is less than half a second of delay…

I don’t have any bigger database of people in Gramps, but I have tested a lot on my “Place-Database”, both on SQLite and MongoDB, and it’s always the initial reading and populating of Gramps that takes time for me, not the editing…

So could you say approximately how many people you have in your database, and whether the database is on an SSD or a spinning hard drive?

It might be differences in hardware that cause this…

As I wrote earlier, I have an Intel i7 4820K running at 3.7 GHz (it’s an old CPU, 6-7 years old), but Windows, all software, and all data stores are on SSDs…

But I could easily move my Gramps database to an HDD to test, if needed…

I do find it strange that my SQLite database seems to be loaded into memory (I have not yet tested writing to the database in this case), but I found the same thing with my place database: it was the initial reading and populating of Gramps that took the most time…

I was thinking about creating a MongoDB instance with this data to see if there is any difference, but I do not have any more space on the SSD where the MongoDB storage is, so at the moment I can’t…
