Problem Statement (AI rewrite)
In Gramps, genealogical databases stored in SQLite can grow to contain tens of thousands of individuals, families, and associated records. As these databases scale, interactive features such as dynamic charting and data analysis become increasingly unresponsive. The primary bottleneck arises from the need to scan and process the entire dataset for each operation, leading to slow performance and analyses that are often skewed by outlier data or irrelevant branches.
Currently, the only practical workaround is to “fork” the tree: manually export a relevant subset of the database and re-import it into a new tree. This approach is cumbersome, error-prone, and disrupts workflow continuity. There is a clear need for a more efficient and persistent method to define and work with meaningful subsets of genealogical data within the same Gramps environment, without duplicating or fragmenting the database.
Human Postscript: There have been previous discussions about adding a limited-scope Dashboard option. The current Dashboard has statistical gramplets that analyze the whole tree, and once the tree contains collateral-line data, those statistics become less pertinent. (As an example, consider the Age Stats or Surname Cloud gramplets: their results become meaningless for a whole tree, but could be very informative when applied to just the Ancestors and/or Descendants of the Proband/Active Person.)
Perplexity.ai suggestion
Possible approach: Persistent Subset via Proxy Tree Feature
To efficiently work with large Gramps SQLite databases, consider implementing a proxy tree feature that allows users to define and persistently store a working subset of the data:
- Subset Definition: Let users select a root person, family, or branch, and specify rules (e.g., ancestors, descendants, tagged individuals) to define the subset.
- Persistent Proxy Table: Store the subset’s handles (unique IDs) in a dedicated table within the SQLite database. This table acts as a filter for all queries and chart displays, so only relevant data is loaded and analyzed.
- Integration: Modify Gramps’ data access layer to use this proxy table as a filter, bypassing the need to scan the entire database for every operation. This approach leverages SQLite’s efficiency and avoids repeated export/import cycles[4][6].
- Performance: This method ensures interactive charts and analyses remain responsive, as only the proxy subset is processed, greatly reducing overhead and minimizing the impact of outlier data[2][3].
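The idea above can be sketched with standard Python and SQLite. Note that the schema here (`person`, `parent_of`, `proxy_subset` tables and their columns) is invented purely for illustration; Gramps' real backend stores serialized objects and would need its own adaptation. The sketch defines a subset (root person plus all descendants) with a recursive CTE, persists the handles in a proxy table, and then joins analysis queries against that table instead of scanning everything:

```python
import sqlite3

# Illustrative schema only -- NOT the real Gramps schema, which stores
# serialized objects rather than flat relational columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (handle TEXT PRIMARY KEY, surname TEXT);
CREATE TABLE parent_of (parent TEXT, child TEXT);      -- one row per link
CREATE TABLE proxy_subset (handle TEXT PRIMARY KEY);   -- persistent filter
""")

# Toy data: A -> B -> C lineage, plus an unrelated person X.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("A", "Alpha"), ("B", "Beta"), ("C", "Gamma"), ("X", "Other")])
conn.executemany("INSERT INTO parent_of VALUES (?, ?)",
                 [("A", "B"), ("B", "C")])

def build_descendant_subset(conn, root_handle):
    """Repopulate proxy_subset with the root and all descendants."""
    conn.execute("DELETE FROM proxy_subset")
    # A recursive CTE walks the parent->child links from the root.
    conn.execute("""
        WITH RECURSIVE lineage(handle) AS (
            SELECT ?
            UNION
            SELECT parent_of.child FROM parent_of
            JOIN lineage ON parent_of.parent = lineage.handle
        )
        INSERT INTO proxy_subset (handle)
        SELECT handle FROM lineage
    """, (root_handle,))
    conn.commit()

build_descendant_subset(conn, "A")

# Any chart/analysis query now joins against the proxy table, so only
# the subset is loaded; person X is never touched.
rows = conn.execute("""
    SELECT p.handle, p.surname FROM person p
    JOIN proxy_subset s ON s.handle = p.handle
    ORDER BY p.handle
""").fetchall()
print(rows)  # [('A', 'Alpha'), ('B', 'Beta'), ('C', 'Gamma')]
```

Because `proxy_subset` lives in the same database file, the subset survives restarts and can be rebuilt only when the user changes the subset rules, rather than on every query.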
This approach is similar to pre-computing filter maps for fast access, as discussed in recent Gramps filter optimizations, and can be implemented using standard Python and SQLite techniques[3][6].
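The "pre-computed filter map" idea can also be illustrated in miniature (again with an invented `person` table and a hypothetical `tag` column): the filter rule is evaluated once against the database, the matching handles are cached in a set, and subsequent per-record checks during chart rendering become O(1) membership tests instead of repeated rule evaluations:

```python
import sqlite3

# Invented toy schema for illustration; not the real Gramps backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (handle TEXT PRIMARY KEY, tag TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("A", "direct"), ("B", "direct"), ("X", "collateral")])

# Evaluate the filter rule once and cache the matching handles in a set.
filter_map = frozenset(
    h for (h,) in conn.execute(
        "SELECT handle FROM person WHERE tag = ?", ("direct",))
)

# Later, each record seen while drawing a chart is checked in O(1).
visible = sorted(h for h in ("A", "B", "X") if h in filter_map)
print(visible)  # ['A', 'B']
```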
Citations:
[1] Gramps Performance - Gramps
[2] Tips for large databases - Gramps
[3] Making Gramps filters Faster, and then Superfast
[4] GEPS 010: Relational Backend - Gramps
[5] Gramps
[6] 2.5.2.4 Lab - Working with Python and SQLite Answers
[7] Collaborate on Optimizing a new Custom Rule
[8] http://app.aspell.net/create?max_size=70&spelling=US&max_variant=0&diacritic=strip&download=wordlist&encoding=utf-8&format=inline
Answer from Perplexity: https://www.perplexity.ai/search/as-an-expert-in-python-tree-da-JkbXuZw9SdupvH2A9.7GbA?utm_source=copy_output