I think we could target some common collections, such as the get_all_relatives() that @ennoborg suggested. A few would be immediately useful: get_all_ancestors(), get_all_descendants(), etc.
This will require a two-step process:
move such functions to the database layer (base class)
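For a database-layer get_all_ancestors(), the core is just a graph traversal over parent links. Here is a minimal sketch; the `parents` mapping is a hypothetical stand-in for the person/family lookups a real base-class implementation would do, not the actual Gramps API:

```python
from collections import deque

def get_all_ancestors(parents, handle):
    """Collect every ancestor handle reachable from `handle`.

    `parents` maps a person handle to the handles of that person's
    parents (a stand-in for the database lookups a real base-class
    implementation would perform).
    """
    seen = set()
    queue = deque(parents.get(handle, ()))
    while queue:
        h = queue.popleft()
        if h in seen:
            continue  # already visited via another line of descent
        seen.add(h)
        queue.extend(parents.get(h, ()))
    return seen
```

Because `seen` guards against revisits, this stays linear even in trees with pedigree collapse.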
I enjoy following these discussions and gain some really useful insights into the workings of GRAMPS.
More so since I decided to stick with v5.1.6, as there was nothing I could see in 5.2 worth my time on the upgrade process, at which point I thought about what would entice me to think again:
- Elimination of data BLOBs
- GRAMPS capable of running successfully from a NAS (not multi-user)
- A hierarchical structure for Repository/Source/Citation, with each source having a reference to the publisher(s) used
- GRAMPS not running due to an upgrade of the operating system
So now I can relax and enjoy the discussions more as intellectual concepts.
phil
Well, this is interesting: I ran the AncestorsOf() rule/filter on all 2157 people in the Example tree.
Without any changes, it clocks in at 2 minutes and 10 seconds.
With some tweaks, it clocks in at 2 seconds. For all of them.
I did all kinds of low-level things (SQL, unpickle, JSON_EXTRACT(), ast.literal_eval), but the biggest difference came from a high-level code change. After the ancestor-collection part of the code had completed, it spent another 2 minutes doing work it didn't need to do.
I think we can get great gains without having to change much except how filters are applied.
Yes, all of the tree walking rules build a list of people in the prepare method. The apply method then unnecessarily loops through the database matching the objects to the existing list.
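The prepare/apply split described above can be sketched like this. The class and the `parents` mapping are illustrative, not the actual Gramps rule classes: the point is that once prepare() has walked the tree, apply() should be a cheap membership test rather than a second pass over the database:

```python
class AncestorsOfRule:
    """Sketch of a tree-walking filter rule (hypothetical names).

    prepare() walks the tree once and records every match; apply()
    is then an O(1) set-membership test per person, so looping the
    whole database again to rebuild the match list is wasted work.
    """

    def __init__(self, root, parents):
        self.root = root
        self.parents = parents  # stand-in for database lookups
        self.matches = set()

    def prepare(self):
        # one tree walk collects every matching handle
        stack = list(self.parents.get(self.root, ()))
        while stack:
            h = stack.pop()
            if h not in self.matches:
                self.matches.add(h)
                stack.extend(self.parents.get(h, ()))

    def apply(self, handle):
        # cheap membership test -- no database re-scan needed
        return handle in self.matches
```

If the filter engine simply returned `rule.matches` directly after prepare(), the per-person apply() loop over the whole database could be skipped entirely for rules like this.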
The poor performance finding duplicate people is also a high-level issue which could be improved by retrieving people from the database in surname groups.
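The surname-grouping idea reduces an all-pairs comparison to comparisons within each group. A minimal sketch, assuming a hypothetical `{handle: surname}` mapping rather than real database calls:

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(people):
    """Yield candidate duplicate pairs, comparing only within
    surname groups instead of across all N*(N-1)/2 pairs.

    `people` is a hypothetical {handle: surname} mapping; a real
    implementation would fetch surname groups from the database.
    """
    groups = defaultdict(list)
    for handle, surname in people.items():
        groups[surname.lower()].append(handle)
    for group in groups.values():
        # only people sharing a surname are plausible duplicates
        yield from combinations(sorted(group), 2)
```

With many small groups this cuts the number of expensive similarity checks dramatically, at the cost of missing duplicates whose surnames were recorded differently.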
I’m sure that there are other places where a better choice of algorithm would help with performance.