Dear Gramps developers,
I recently completed integrating PostgreSQL Enhanced with GrampsWeb. The exercise required 12 (!!!) workarounds to handle architectural assumptions about storage. I’m sharing this as a concrete example of why we need to revisit the Gramps storage layer architecture.
Technical Findings
Filesystem Coupling in Core
gramps/gen/db/dbconst.py:73
DBBACKEND = "database.txt" # Hardcoded assumption
This constant propagates through the entire stack. GrampsWeb’s is_tree() function literally checks for this file to determine database existence. For a pure-database backend, I had to create fake filesystem structures.
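To make the workaround concrete, here is a minimal sketch of the kind of placeholder structure I mean. It assumes that a tree is recognized by a directory containing database.txt and that the file holds the backend id. The helper names create_placeholder_tree and looks_like_tree are hypothetical and are not taken from Gramps or GrampsWeb.

```python
import os

DBBACKEND = "database.txt"  # same file name as the constant quoted above


def create_placeholder_tree(base_dir: str, tree_id: str, backend_id: str) -> str:
    """Create a directory whose only purpose is to satisfy file-based checks.

    Hypothetical helper: the real data lives in the database server,
    so nothing else is stored in this directory.
    """
    tree_dir = os.path.join(base_dir, tree_id)
    os.makedirs(tree_dir, exist_ok=True)
    marker = os.path.join(tree_dir, DBBACKEND)
    with open(marker, "w", encoding="utf-8") as handle:
        handle.write(backend_id)  # assumption: the file holds the backend id
    return tree_dir


def looks_like_tree(tree_dir: str) -> bool:
    """Hypothetical stand-in for a check that tests for the marker file."""
    return os.path.isfile(os.path.join(tree_dir, DBBACKEND))
```

The point is that the directory carries no information the database does not already have; it exists only so that file-based checks pass.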
Missing Base Class Methods
GrampsWeb calls these methods not defined in DbGeneric:
get_transactions(page, pagesize, **kwargs)
set_metadata(key, value)
get_metadata(key, default)
Every backend implementing web support has to add these independently. There’s no interface contract.
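As a rough illustration of what an interface contract could look like, the sketch below declares the three methods on an abstract base class. The class names, the type hints, the default of None for get_metadata, and the 1-based page numbering are all my assumptions, not existing Gramps API.

```python
from abc import ABC, abstractmethod
from typing import Any


class WebBackendInterface(ABC):
    """Hypothetical contract covering the methods listed above."""

    @abstractmethod
    def get_transactions(self, page: int, pagesize: int, **kwargs: Any) -> list:
        """Return one page of transactions."""

    @abstractmethod
    def set_metadata(self, key: str, value: Any) -> None:
        """Store a metadata value under key."""

    @abstractmethod
    def get_metadata(self, key: str, default: Any = None) -> Any:
        """Return the metadata value for key, or default if it is missing."""


class InMemoryBackend(WebBackendInterface):
    """Toy implementation that can double as a test mock."""

    def __init__(self) -> None:
        self._metadata: dict = {}
        self._transactions: list = []

    def get_transactions(self, page, pagesize, **kwargs):
        start = (page - 1) * pagesize  # assumption: pages are 1-based
        return self._transactions[start:start + pagesize]

    def set_metadata(self, key, value):
        self._metadata[key] = value

    def get_metadata(self, key, default=None):
        return self._metadata.get(key, default)
```

With a contract like this, a backend that forgets one of the methods fails at instantiation time instead of at the first web request, and tests can use the in-memory class as a mock.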
Static Backend Registration
gramps_webapi/dbmanager.py:44
ALLOWED_DB_BACKENDS = ["sqlite", "postgresql", "sharedpostgresql"]
Adding a backend requires patching source files. We should be using entry points or a registry pattern.
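A registry could be as small as the sketch below. Every name in it (register_backend, allowed_backends, make_backend, the stub class) is hypothetical; it only shows the shape of the pattern, not how GrampsWeb would actually wire it up.

```python
from typing import Callable, Dict, List

# Maps a backend id to a factory that creates the backend object.
_BACKENDS: Dict[str, Callable[[], object]] = {}


def register_backend(backend_id: str):
    """Class decorator that records a backend factory under backend_id."""
    def decorator(factory):
        if backend_id in _BACKENDS:
            raise ValueError(f"backend {backend_id!r} is already registered")
        _BACKENDS[backend_id] = factory
        return factory
    return decorator


def allowed_backends() -> List[str]:
    """Replacement for a hardcoded list: whatever has been registered."""
    return sorted(_BACKENDS)


def make_backend(backend_id: str):
    """Instantiate a registered backend or fail with a clear error."""
    try:
        factory = _BACKENDS[backend_id]
    except KeyError:
        raise ValueError(f"unknown backend {backend_id!r}") from None
    return factory()


@register_backend("postgresqlenhanced")
class PostgreSQLEnhancedStub:
    """Placeholder class used only to demonstrate registration."""
```

The same dictionary could be filled from packaging entry points (for example via importlib.metadata.entry_points with an agreed group name) so that an installed add-on registers itself without any source patch.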
SQL Dialect Issues
psycopg2 vs psycopg3 incompatibility in PostgreSQL backends:
-- psycopg2
WHERE x.name IN %s
-- psycopg3
WHERE x.name = ANY(%s)
Each PostgreSQL variant handles this differently, so we need dialect abstraction. Personally, I LOATHE psycopg2 after many long years of pain using it, and you won’t find it in any code I write. It is an older driver that has since been succeeded by psycopg 3, and I frankly see no reason to use it in new code. Getting it to work in the real world is usually painful because of its slightly different syntax and its age, which also makes finding a working driver build a pain in most setups.
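The two placeholder styles could be hidden behind a small dialect object, sketched below. The class and method names are hypothetical. The sketch relies on psycopg2 adapting a Python tuple to a parenthesized SQL list and on psycopg 3 adapting a Python list to a PostgreSQL array; it only builds the SQL fragment and parameters, so it runs without either driver installed.

```python
from typing import Any, Sequence, Tuple


class Psycopg2Dialect:
    """Builds membership tests in the psycopg2 style."""

    def in_clause(self, column: str, values: Sequence[Any]) -> Tuple[str, tuple]:
        # The tuple is passed as ONE parameter and expanded by the driver.
        # Note: an empty tuple would produce invalid SQL with this form.
        return f"{column} IN %s", (tuple(values),)


class Psycopg3Dialect:
    """Builds membership tests in the psycopg 3 style."""

    def in_clause(self, column: str, values: Sequence[Any]) -> Tuple[str, tuple]:
        # The list is passed as ONE parameter and sent as an array.
        return f"{column} = ANY(%s)", (list(values),)


def names_query(dialect, names: Sequence[str]) -> Tuple[str, tuple]:
    """Example caller: the query code no longer knows which driver is in use.

    The column name is interpolated directly, so it must come from trusted
    code and never from user input.
    """
    clause, params = dialect.in_clause("x.name", names)
    return f"SELECT * FROM x WHERE {clause}", params
```

The resulting pair would be passed to cursor.execute(sql, params) by whichever driver the backend uses.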
Architectural Impact
But this isn’t PostgreSQL-specific. Any non-filesystem backend faces these issues:
- Backend proliferation: We have 3+ PostgreSQL variants because each developer solves these problems differently
- Code duplication: Same features reimplemented in each backend
- Testing complexity: No interface contract means no meaningful mocks
- Feature gaps: Backends implement different subsets of functionality
The Deeper Issue: DBAPI Architecture
These problems stem from fundamental issues with the DBAPI layer:
- Not Actually DB-API 2.0 Compliant: Despite its name, DBAPI doesn’t follow Python’s DB-API 2.0 specification (PEP 249)
- SQLite-Centric Design: The entire abstraction assumes SQLite patterns
- Filesystem Coupling: Directory-based initialization (_initialize(directory, username, password))
- No Connection Abstraction: Single connection architecture prevents pooling, replicas, or scaling
As documented in previous analyses, DBAPI is actually three things mixed together:
- SQLite-specific implementation details (should be in SQLite class)
- SQL database abstractions (should be in a SQL base class)
- General database operations (already in DbGeneric)
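The three-way split listed above might be layered roughly as follows. This is a sketch under assumptions, not a proposal for concrete class names: GenericStore stands in for the DbGeneric role, SqlStore for the missing SQL base class, and SqliteStore for the SQLite-specific part. The metadata table layout is invented for the example.

```python
import sqlite3
from abc import ABC, abstractmethod


class GenericStore(ABC):
    """General database operations, independent of SQL (DbGeneric role)."""

    @abstractmethod
    def set_metadata(self, key, value): ...

    @abstractmethod
    def get_metadata(self, key, default=None): ...


class SqlStore(GenericStore):
    """SQL shared by every SQL backend; knows nothing about files."""

    placeholder = "?"  # parameter marker, overridden per driver

    def __init__(self):
        self.conn = self._connect()
        self.conn.cursor().execute(
            "CREATE TABLE IF NOT EXISTS metadata "
            "(setting TEXT PRIMARY KEY, value TEXT)"
        )

    @abstractmethod
    def _connect(self):
        """Return a DB-API connection; the only driver-specific step."""

    def set_metadata(self, key, value):
        p = self.placeholder
        cur = self.conn.cursor()
        cur.execute(f"DELETE FROM metadata WHERE setting = {p}", (key,))
        cur.execute(
            f"INSERT INTO metadata (setting, value) VALUES ({p}, {p})",
            (key, value),
        )
        self.conn.commit()

    def get_metadata(self, key, default=None):
        cur = self.conn.cursor()
        cur.execute(
            f"SELECT value FROM metadata WHERE setting = {self.placeholder}",
            (key,),
        )
        row = cur.fetchone()
        return row[0] if row else default


class SqliteStore(SqlStore):
    """SQLite specifics (file path, driver) live here and only here."""

    def __init__(self, path=":memory:"):
        self.path = path
        super().__init__()

    def _connect(self):
        return sqlite3.connect(self.path)
```

A PostgreSQL class would then override only _connect and placeholder instead of working around SQLite assumptions in a shared base.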
The Storage Layer Challenge
The current inheritance hierarchy creates cascading problems:
DbReadBase + DbWriteBase
↓
DbGeneric
↓
DBAPI ← [SQLite assumptions baked in]
↓
SQLite / PostgreSQL (forced to work around DBAPI)
Meanwhile, sophisticated patterns exist but are disconnected:
- Proxy pattern (PrivateProxy, LivingProxy, FilterProxy)
- Transaction management (DbTxn)
- Multi-tenant support (SharedDBAPI)
These need to be preserved while fixing the storage layer.
Architectural Direction
What we need:
1. Clean Separation of Concerns
- Move SQLite specifics out of DBAPI into SQLite class
- Create proper SQL abstraction base class
- Enable non-filesystem backends without workarounds
2. Modern Storage Patterns
- Connection string support (not just directory paths)
- Connection pooling capabilities
- Support for cloud databases
- Native JSON types where available
3. Backward Compatibility Through Layering
Following the “Layered Core with Compatibility Shell” pattern:
- Existing code continues to work unchanged
- New capabilities available through opt-in
- Progressive migration without breaking changes
The goal isn’t to prescribe a specific solution, but to enable the flexibility that modern genealogy software needs while preserving Gramps’s stability and existing patterns.
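To show how connection string support and the compatibility shell could coexist, here is a small sketch that accepts either a legacy directory path or a connection URL. ConnectionTarget and resolve_target are hypothetical names, and treating a plain path as SQLite is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class ConnectionTarget:
    """Hypothetical, backend-neutral description of where the data lives."""
    backend: str
    directory: Optional[str] = None
    host: Optional[str] = None
    port: Optional[int] = None
    database: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None


def resolve_target(location: str,
                   username: Optional[str] = None,
                   password: Optional[str] = None) -> ConnectionTarget:
    """Accept either a legacy directory path or a connection URL."""
    if "://" in location:
        parts = urlsplit(location)
        return ConnectionTarget(
            backend=parts.scheme,
            host=parts.hostname,
            port=parts.port,
            database=parts.path.lstrip("/") or None,
            # Credentials in the URL win over the separate arguments.
            username=parts.username or username,
            password=parts.password or password,
        )
    # Compatibility shell: existing callers keep passing a directory.
    # Assumption for this sketch: a plain path means a SQLite tree.
    return ConnectionTarget(
        backend="sqlite",
        directory=location,
        username=username,
        password=password,
    )
```

Existing callers that pass a directory keep working unchanged, while new code can opt in by passing something like postgresql://user:secret@db.example.org:5432/tree (an example URL, not a real server).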
Why This Matters Now
The genealogy landscape is evolving:
- Multi-user Collaboration: Families want to work on shared trees simultaneously
- Cloud Storage: Users expect data accessible from anywhere
- Scale: Large trees need real database features (indexing, query optimization)
- Modern Backends: Graph databases for relationships, time-series for events
The current architecture blocks all of these use cases.
Evidence from the Field
My PostgreSQL Enhanced integration is just one example. We’re seeing:
- Multiple PostgreSQL variants (postgresql, sharedpostgresql, postgresqlenhanced)
- Each solving the same problems differently
- No standard way to support cloud databases
- Repeated workarounds for the same architectural issues
Moving Forward
This isn’t about one backend or implementation preference. It’s about enabling Gramps to support the storage systems users need, whether that’s:
- Local SQLite for individual users
- PostgreSQL for concurrent access
- Cloud databases for web deployment
- Graph databases for relationship analysis
The 12 workarounds I needed demonstrate that the current Gramps architecture is blocking innovation. We can fix this while maintaining full backward compatibility through careful architectural evolution.
For Discussion
Rather than proposing a specific implementation, I’d like to discuss:
- Do we agree that storage layer abstraction is needed?
- What are the non-negotiable compatibility requirements?
- How can we enable modern backends without disrupting existing ones?
- What’s the right timeline for architectural evolution?
The PostgreSQL Enhanced integration works - but it shouldn’t require 12 hacks. Let’s discuss how to remove these barriers for all backend developers.
Working code and a detailed analysis are available for review.