Sure, send me your email.
Interesting… I have a database that I had downloaded from Ancestry.
It does not have any Enclosed By entries.
I went in and selected an event and added an Enclosed By to the Place, and it saved in less than a second.
So, in my main database there has to be something causing this condition.
Unless it relates to the number of enclosed by records…
And in the meantime, I did some tests too, by adding a residence event to your person, as if you paid a visit to Alcester, Warwickshire, England, which is enclosed by the United Kingdom, after 31 Dec 1800. And on Windows, running in VirtualBox inside the Linux session that I’m using to type this, the first OK takes 15 seconds, just like you reported earlier.
On the Linux host, with the Windows session running in the background, it takes about 10 seconds.
I just added another residence event, as if you visited Amsterdam, Noord-Holland, Nederland, which is enclosed by Europe, and that’s fast. And adding another residence event linked to a new place in the UK was slow again.
The difference between these situations is that in Europe you had no enclosed cities, and in the UK you have hundreds, maybe even a thousand. They’re all directly enclosed by the UK, and not in a tree under England, Scotland, or Wales, separated by county. And I think that’s the culprit: when you add a new city there, Gramps needs to rebuild the whole list under the UK. Adding a new residence inside Canada is also quite fast.
There is something odd though, because adding a test location under the UK from the location view is fast too. But maybe that’s because, at that moment, the location view is loaded in memory, while it isn’t when you’re working in the people or event view.
A quick glance at the database with DB Browser for SQLite shows that the place table has an index for titles, but not for names. The consequence is that the program has to read all entries enclosed by the UK to find the spot where the new place should be inserted.
This is still a guess, but the symptoms point in that direction.
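If you want to check that guess, you can look at the indexes on the place table yourself with a small Python/sqlite3 script like the one below. The file name and column names are assumptions, so verify them against your own schema, and only experiment on a backup copy:

```python
import sqlite3

# Open the tree's SQLite file read-only; the file name is an assumption,
# check the folder of your family tree first.
con = sqlite3.connect("file:sqlite.db?mode=ro", uri=True)

# List the indexes that exist on the place table.
for name, sql in con.execute(
    "SELECT name, sql FROM sqlite_master "
    "WHERE type = 'index' AND tbl_name = 'place'"
):
    print(name, sql)

con.close()

# If a column that the enclosing-place lookup filters on turns out to be
# unindexed, an experiment on a copy could be something like:
#   CREATE INDEX IF NOT EXISTS idx_place_name ON place (name);
```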
So a workaround would be to separate England, Wales and Scotland into Enclosed By groups, which would lessen the number of records that have to be read???
Yes, that’s right. And that’s easy with the Place Update Gramplet, which can generate a hierarchy from the names. It will take a while, but it works well. One way to work with it is to create a filter on “, England”, select all, or a few hundred, of the places coming through that filter, and then switch to the Place Update Gramplet to create a hierarchy for the selected elements.
When you do that, the screen will freeze for a while, but when the creation is done, all updated places will disappear from the filtered results, so that you can repeat the process until nothing comes out of the filter. You can then repeat the process for Scotland and Wales.
And if you want, you can also just filter on “,”, because that will select all locations that have a comma in them, which will be removed after processing. That’s what I did here when I migrated from 3.4.9 to 5.1.5 (just for testing).
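For the curious, the core idea, creating one top-level place and enclosing the filtered places in it, can be sketched roughly like this against the gramps.gen.lib API. This is only an illustration of the concept, not the gramplet’s actual code, the method names are as I remember them, and you should only try something like it on a copy of your tree:

```python
from gramps.gen.db import DbTxn
from gramps.gen.lib import Place, PlaceName, PlaceRef

def enclose_in(db, parent_title, child_handles):
    """Create a new top-level place and enclose the given places in it."""
    with DbTxn("Enclose places in %s" % parent_title, db) as trans:
        parent = Place()
        name = PlaceName()
        name.set_value(parent_title)
        parent.set_name(name)
        parent_handle = db.add_place(parent, trans)

        for handle in child_handles:
            child = db.get_place_from_handle(handle)
            ref = PlaceRef()
            ref.set_reference_handle(parent_handle)
            # Note: this only adds a new enclosing place; any existing link
            # (e.g. directly to the UK) is left in place unless you clear it,
            # which is exactly what the gramplet's option is for.
            child.add_placeref(ref)
            db.commit_place(child, trans)
    return parent_handle
```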
Maybe I haven’t done it right. Just experimented with Scotland.
It took each name and broke it out into a hierarchy, e.g. duplicating Scotland many times, along with each of the other parts of the name. It did not group them. Do I have to do a merge or something?
None of the new entries have the parameters of the top-level “Scotland” or “United Kingdom” that is already there.
OOPS! I saw the same here, on your database.
When I run it on mine, which is a fresh import from 3.4.9 (because I still work with that), it works OK, creating one Scotland, or one Nederland. And maybe that’s because my database has no hierarchy before the conversion.
This suggests that the tool only works well for places that are not in a hierarchy already.
But it says this in the help file:
> Duplicate place names at the same level are automatically merged.
I guess I’ll have to do it manually. That sucks.
There is a remedy. If you check “Clear original enclosing places”, the original link to the UK will be cleared, so that you get one top-level England, Scotland, Wales, and so forth. I tried that here, before reading the wiki, and it works, see:
https://www.gramps-project.org/wiki/index.php/Addon:PlaceUpdate_Gramplet
What a relief!
You might also want to check out the Automerge feature of Kari’s experimental MultiMerge Gramplet. You can do extended selections and pack down the list of redundant places more quickly.
Just want to say thanks to those that helped with this issue. Now that there is a hierarchy in Places, the performance is acceptable. About 1 second for a new record.
I believe there are going to be changes to Places in the 5.2 release. Maybe a bug should be opened so this condition is not overlooked in the new code.
A quick check suggests that Gramps 5.2 will be OK. I tested that with the latest source code from GitHub.
It is always true in reality. A blob always goes to a single column in a table, so it’s quite a simple write to disk. Splitting the same object into a normalized representation in SQL means that the data must be written to more than one column, and most often also to more than one table. For example, when you create a person, or a place, you need to write to a separate table for (alternate) name variants, and that means more work for the database server, and more writes to disk. I know that because I’ve followed a couple of Oracle courses, and I looked at analysis reports to see how fast data was read or written, and how the SQL statement was translated into actions on tables.
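To make that concrete, here is a schematic comparison using Python’s sqlite3 module. This is not the real Gramps schema, just an illustration of “one blob write” versus “several normalized writes”:

```python
import pickle
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Blob style: the whole person is serialized into one column of one table,
# so a save is a single INSERT.
cur.execute("CREATE TABLE person_blob (handle TEXT PRIMARY KEY, blob_data BLOB)")
person = {"handle": "I0001", "primary_name": "John Smith",
          "alternate_names": ["Jack Smith", "J. Smith"]}
cur.execute("INSERT INTO person_blob VALUES (?, ?)",
            ("I0001", pickle.dumps(person)))

# Normalized style: the same object spreads over several tables,
# so one save becomes several INSERTs (plus index updates).
cur.execute("CREATE TABLE person (handle TEXT PRIMARY KEY, primary_name TEXT)")
cur.execute("CREATE TABLE person_name (handle TEXT, name TEXT)")
cur.execute("INSERT INTO person VALUES (?, ?)", ("I0001", "John Smith"))
cur.executemany("INSERT INTO person_name VALUES (?, ?)",
                [("I0001", "Jack Smith"), ("I0001", "J. Smith")])
con.commit()
```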
Also, in SQLite, a view is read-only, so you can’t commit anything to it. You can write to a view in Microsoft SQL Server, but most of us don’t use that, and I don’t think we support it in our DB API.
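A tiny demonstration of that read-only behaviour (plain sqlite3, nothing Gramps-specific): unless you add INSTEAD OF triggers, SQLite rejects writes to a view:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE place (handle TEXT, name TEXT)")
con.execute("CREATE VIEW english_places AS "
            "SELECT * FROM place WHERE name LIKE '%, England'")

try:
    con.execute("INSERT INTO english_places VALUES ('P0001', 'Alcester, England')")
except sqlite3.OperationalError as err:
    # SQLite refuses: views are read-only without INSTEAD OF triggers.
    print(err)   # e.g. "cannot modify english_places because it is a view"
```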
This does not mean that I’m against normalizing our data model, and writing real SQL. It just isn’t faster, and it doesn’t automatically solve all speed problems. I personally think that it’s a better solution than changing the blobs to JSON, because it saves the time and coding complexity that’s involved with mixing technologies. And real SQL makes it way easier to read and write data with other tools.
You are right, normalized tables probably won’t be faster for loading objects than just loading them from blobs. However, filtering will be much faster, even if the data is split into several tables, because Gramps would no longer need to load the entire object all the time, as it does right now.
I understand what you mean, but … when you filter on a part of a name, which is the filter that I use most, the database needs to read all rows in the table. This also happens when you filter on the full text on a column that is not indexed. This is what we call a full table scan, see
The process may still be fast when tables are cached by the database engine, but I think that in Gramps, they are cached by the software itself, meaning that the filtering is done in the view that’s created by Gramps.
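If you want to see that full table scan for yourself, EXPLAIN QUERY PLAN shows it nicely. The schema below is made up for the example; the point is that a substring filter with a leading wildcard cannot use an index even when one exists:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE place (handle TEXT PRIMARY KEY, name TEXT)")
con.execute("CREATE INDEX idx_place_name ON place (name)")

# Substring filter: the leading '%' means the index can't be used,
# so SQLite reads every row.
for row in con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM place WHERE name LIKE '%cester%'"):
    print(row)   # detail column shows something like 'SCAN place'

# Exact match: here the index can be used.
for row in con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM place WHERE name = 'Alcester'"):
    print(row)   # detail column shows 'SEARCH place USING ... idx_place_name'
```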
I wonder if some timing tests would be beneficial?
During tests for the new CardViews, @cdhorn had rendering timehacks output to the console. Could similar logging insertions be incorporated into a third-party add-on fork of the Filter Gramplets?
Maybe alternatives to the Filter Gramplet can be tested for optimization in different areas. (Like one that runs Custom Filters faster, or just given names.)
For instance, what are the comparative timings for the same custom filter via the Filter Gramplet within a view, in the Filter Editor test, in @kku’s FilterParams test, in the Tagging tool, and in the export?
I need to add that my PostgreSQL runs on a Windows machine with 12 cores / 24 threads, the DB engine has access to 64 GB of memory, the software is installed on an NVMe SSD, the database and indexes are spread over three different NVMe SSDs, all of the SSDs have 35-55% or more free space, and the swap file function in Windows is turned off.
The timings on a low-RAM computer with an HDD will of course be totally different.
It would be nice to have logs that show timestamps, or even CPU ticks used, or similar things, at different places, so that we can see how long it takes to read large chunks of data from the DB, to execute a filter, or to repopulate a view after the filter has run. This is a quick way to see where the bottlenecks are, also for actions like Dave moving a town into a flat and wide list under the UK. Such logs can quickly show the spots that may need improvement.
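Something as small as the sketch below would already give that kind of timestamped log. The two calls in the usage comment are hypothetical stand-ins for the real Gramps steps, not actual API:

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(message)s")
LOG = logging.getLogger("gramps.timing")

@contextmanager
def timed(label):
    """Log wall-clock and process CPU time spent in a block of code."""
    t0, c0 = time.perf_counter(), time.process_time()
    try:
        yield
    finally:
        LOG.debug("%s: %.3f s wall, %.3f s cpu",
                  label, time.perf_counter() - t0, time.process_time() - c0)

# Usage, e.g. around the suspect steps:
#   with timed("apply place filter"):
#       matches = my_filter.apply(db)      # hypothetical call
#   with timed("repopulate view"):
#       rebuild_view(matches)              # hypothetical call
```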
Some years ago, I had to build macros that simulated a lot of database functionality in a spreadsheet workbook. (Let’s skip the utter stupidity that made that necessary. Assume, correctly, that it had to do with a government bureaucracy directive that allowed no other option.)
For that tool, I had to set up a ‘debug’ global variable. When the variable was on, each module output a timehack (identifying the module name and parameter being passed) to a spreadsheet page at the entry & exit of the module. (Naturally, it also supported inserting another log-entry at other critical points.) It was innately pessimistic since writing to the spreadsheet slowed the module performance.
I know Gramps has a debug mode. Does it have something similar to the above (admittedly crude) hack?
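For reference, the spreadsheet hack boiled down to something like this, translated to Python. It is purely illustrative, with a made-up debug flag, and is not how Gramps itself does it:

```python
import functools
import sys
import time

DEBUG_TIMEHACKS = "--debug-timehacks" in sys.argv   # stand-in for a debug flag

def timehack(func):
    """Print entry/exit timehacks for a function, like the spreadsheet macros did."""
    if not DEBUG_TIMEHACKS:
        return func            # no overhead at all when the flag is off

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"{time.strftime('%H:%M:%S')} ENTER {func.__name__} args={args!r}")
        t0 = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            print(f"{time.strftime('%H:%M:%S')} EXIT  {func.__name__} "
                  f"({time.perf_counter() - t0:.3f} s)")
    return wrapper
```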