I’m currently using gramps-web to manage my family tree and would like to ensure the generational data is preserved long-term.
Currently, all media files are stored in S3, and I’m setting up bucket-to-bucket replication for disaster recovery. However, the tree data itself isn’t automatically backed up — it only exists within the web instance unless I manually export it. That makes it quite fragile in the face of potential risks, such as web vulnerabilities, data corruption, or even data centre incidents.
I’m looking for a solution that can periodically fetch data via the API and write both tree and media data to a backup destination of my choice. Ideally, it would support common storage backends such as S3 or Google Drive, since these are widely available and often remain usable even without active payment.
In an ideal setup, backups could be written to multiple destinations simultaneously. This could potentially be implemented as a sidecar container in docker-compose, but that would require awkward hacks to get proper scheduling. I’m wondering if there’s interest in making it a module within the web API itself, especially if others face similar needs. If that’s the case, I would look into using the mechanisms that are already available, such as remote media credentials and Celery; from what I can tell, Celery provides some task-scheduling functionality.
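To make the multi-destination idea concrete, here is a hypothetical fan-out helper, just a sketch, not an existing API: each destination is a callable that accepts the backup bytes (one could wrap an S3 upload via boto3, a Google Drive upload, or a plain local file write), and a failure in one destination doesn’t stop the others.

```python
from typing import Callable, Dict


def write_everywhere(
    data: bytes, writers: Dict[str, Callable[[bytes], None]]
) -> Dict[str, str]:
    """Write the same backup payload to every configured destination.

    `writers` maps a destination name to a callable that stores the bytes.
    Failures are collected per destination instead of aborting the fan-out.
    """
    results = {}
    for name, writer in writers.items():
        try:
            writer(data)
            results[name] = "ok"
        except Exception as exc:  # report, don't crash the other writers
            results[name] = f"failed: {exc}"
    return results


# Illustrative usage (paths and destinations are placeholders):
# write_everywhere(payload, {
#     "local": lambda d: open("/backups/tree.gramps", "wb").write(d),
#     "s3":    lambda d: my_s3_upload(d),  # hypothetical boto3 wrapper
# })
```

The per-destination result dict also gives a natural place to hook in alerting when one backend starts failing silently.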
Would love to hear how others handle this, or if anyone has thoughts on building such a backup mechanism.
I agree that would be a very useful feature. One trivial solution I thought of (but haven’t tried yet) is a simple script that fetches an XML export via the API. It’s probably just a few lines of code and could run entirely outside the instance. What’s not so nice about it is that it would require hardcoding a username and password, or at least a refresh token (which has the same security implications). So that could be your version 0.
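A minimal sketch of such a version-0 script, standard library only. The endpoint paths here (`/api/token/` for login, `/api/exporters/gramps/file` for the XML export) are my assumption based on the Gramps Web API, so verify them against your instance’s API docs; the base URL and credentials are placeholders.

```python
import json
import urllib.request
from datetime import datetime, timezone

BASE_URL = "https://gramps.example.com"  # placeholder: your instance


def backup_filename(now=None) -> str:
    """Timestamped name so successive backups never overwrite each other."""
    now = now or datetime.now(timezone.utc)
    return f"gramps-backup-{now:%Y%m%dT%H%M%SZ}.gramps"


def fetch_backup(username: str, password: str) -> bytes:
    """Log in, then download a full XML export of the tree."""
    token_req = urllib.request.Request(
        f"{BASE_URL}/api/token/",
        data=json.dumps({"username": username, "password": password}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(token_req) as resp:
        token = json.load(resp)["access_token"]
    export_req = urllib.request.Request(
        f"{BASE_URL}/api/exporters/gramps/file",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(export_req) as resp:
        return resp.read()


# Usage sketch (credentials hardcoded, hence the caveat above):
#   data = fetch_backup("backup-user", "change-me")
#   open(backup_filename(), "wb").write(data)
```

Dropped into a cron job, the resulting file can then be shipped to S3 or any other destination with whatever tooling you already use for the media bucket.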
Beyond that, I agree some kind of automated backup would be great, kind of like in Home Assistant.
It’s true that Celery beat works well for scheduled tasks. The reason I’ve shied away from it so far (it would have been very useful for the telemetry function, for instance) is that it would require another container, so everybody would have to modify their docker-compose files manually to make it work, which I think is a big burden relative to how useful this feature would be.
A hack (the one I also use for telemetry) is a pseudo-cron job that is triggered on every HTTP request. It means the task wouldn’t run if nobody accesses the tree for a long time, but that’s probably OK for a backup task, since in that case the data wouldn’t be changing either.
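For what it’s worth, that pseudo-cron pattern fits in a few lines. This is a generic sketch (the names are mine, not Gramps Web internals) of a throttle that fires a background task at most once per interval, meant to be called from a request hook such as Flask’s `before_request`:

```python
import threading
import time


class PseudoCron:
    """Run a task at most once per interval, piggybacking on HTTP requests."""

    def __init__(self, task, interval_seconds: float):
        self.task = task
        self.interval = interval_seconds
        self.last_run = 0.0
        self._lock = threading.Lock()

    def tick(self) -> bool:
        """Call on every request; runs the task in the background if due."""
        now = time.monotonic()
        with self._lock:
            if self.last_run and now - self.last_run < self.interval:
                return False
            self.last_run = now
        # Run off the request thread so the HTTP response isn't delayed.
        threading.Thread(target=self.task, daemon=True).start()
        return True
```

Hooked in once, e.g. `app.before_request(lambda: backup_cron.tick())` in a Flask app, it runs the backup opportunistically whenever the interval has elapsed and someone is actually using the tree.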
Please do open a feature request on the backend repo where we can track it.
By the way, independently of the backup feature, we also need a proper restore feature - you can create a new instance from a backup XML anytime, but you cannot reset an existing instance to a backup. This is tracked here:
(Small addendum: in case you’re wondering how I can sleep at night without having this in place for my tree - it’s because I use Gramps Desktop with the sync addon, and Gramps desktop creates automated backups which I back up along with all other relevant data on my hard disk. So I’m covered - but of course we should still implement this feature in Gramps Web itself because not everybody uses Gramps Desktop.)
My concern is Gramps Web being the source of truth, as my father, who lives far abroad, and I are the only people collaborating on the tree. But I think your local-to-remote case is more common among users.
On a side note: that’s where you’re right about the burden of editing docker-compose. There also aren’t many online discussions about the hosting aspects, such as CDNs, security, and availability, that I could contribute to, though gramps-web is rarely high-load software in those respects.
The pseudo-cron idea is okay. However, it doesn’t account for retention policies at the destinations: those could delete older files if the storage is overflowing, or simply past a certain age threshold. That’s why we want to make sure fresh backups actually keep being made.
Thanks for the input. I’ll draft something for myself, but also create a feature request.