RCS-based archives

When Gramps detects that the rcs utility is installed on the computer it is running on, a new Archive/Extract button becomes available in the Manage Trees dialog. This button lets you create an XML “snapshot” of the current DB or create a new database from a previous XML archive.

I have not yet fully analysed how the feature works, i.e. whether the snapshots are always taken relative to the current state of the DB or relative to the latest snapshot, to take advantage of the “diff” capability of rcs and possibly minimise the size of successive snapshots.

Since rcs is somewhat obsolete and no longer actively maintained, if at all (for source project management it has been superseded by more “modern” tools like git, svn, Hg, …), my question is about its present usefulness.

As far as you know, is the feature really used?

It is always possible to use Backup instead, which also produces an XML file. The only advantage of rcs would be saving only a “delta” over the latest backup, but I doubt this is done or even possible: it requires that the archives have a logical tree-like relationship among themselves. I suspect this is not the case. In any case, when a previous snapshot is extracted, it creates a new independent DB whose future snapshots are not related to the original ones. This cancels the potential advantage of “delta-archiving”, because new archives are hosted in a new archival tree (which therefore never takes the shape of a tree; it is rather a “comb”).

Given this, is it worth considering dropping the rcs feature? Or else redefining it under a strict, consistent specification?

Yes, before a big edit or change, as a checkpoint you can roll back to quickly.

Semi-related conversation on reddit: Looking for workflow/toolset for storing GrampsXML in online git, with an interesting idea from one of the commenters: replace RCS with Git for the Gramps built-in “Archive a family tree” feature. I’d suggest converting the existing version control system (RCS, Git or any other) into addons that provide that functionality via a preference in settings, the same way you can select different database backends in Gramps?

Related Gramps feature request: 0004901: Add a new storage format base on GIT, which has an interesting link to another person’s method of using Gramps XML with Git.

I see you also last discussed this on the mailing list a while ago: Re: [Gramps-devel] Revision control feature? From: Patrick Gerlier 2020-03-25 17:57:33

I’d hazard no, as it works on Linux and bothers no one on Windows; not sure about macOS?

My workflow is a bit different. I don’t use Archive but Family Trees>Export which also gives an XML “snapshot”. The difference lies in the way you “comment” the export/archive. With Archive, your comment is shown in the Manage Trees dialog while in the Export case you must give a relevant file name. Basically, the result is the same. User-friendliness should be compared and rated.

As I mentioned, I doubt there is a substantial footprint advantage (total size of archives) because the archives are not linked in a parent-child relationship. I fear they are just siblings of the current DB and rcs doesn’t apply delta processing in this case.

Thanks for pointing me to this request. The comments rightly point out the difficulty: the DB XML description is presently “monolithic”, making it difficult to benefit from Git diff efficiently. In addition, we should check whether records are always issued in the same order. For instance, in the GNU suite for message translation (msgxxx and gettext), a small change in source strings may cause the *.po(t) files to undergo a dramatic change in the order of the various strings, making diff comparison absolutely useless for getting an idea of what changed. This means that any archival in an SCM (source code management) system will not use delta-backup at its best: in practice the complete new string file is saved.

Before embarking on a tentative implementation, one must make sure that record issuance order is stable under any edits, including edits to the Gramps ID (hopefully, records are identified internally by their “handle”, which never changes over the life of a DB, but this “handle” is not the same from DB to DB: it changes when you Extract or Import).
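To illustrate the ordering concern, here is a minimal sketch that sorts records by handle before diffing, so that a reordered export of the same DB produces no spurious differences. The element and attribute names (`people`, `person`, `handle`, `id`) are simplified assumptions for illustration, not the full Gramps XML schema, and `canonical_lines` is a hypothetical helper, not part of Gramps.

```python
import difflib
import xml.etree.ElementTree as ET


def canonical_lines(xml_text, key_attr="handle"):
    """Return the XML as lines, with records inside each container sorted by key_attr."""
    root = ET.fromstring(xml_text)
    for container in root:
        children = list(container)
        # Only reorder containers whose children all carry the key attribute.
        if children and all(key_attr in child.attrib for child in children):
            container[:] = sorted(children, key=lambda child: child.attrib[key_attr])
    ET.indent(root)  # requires Python 3.9+; puts one element per line
    return ET.tostring(root, encoding="unicode").splitlines()


def changed_lines(old_xml, new_xml):
    """Return only the added/removed lines between two canonicalised snapshots."""
    diff = difflib.unified_diff(
        canonical_lines(old_xml), canonical_lines(new_xml), lineterm=""
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
```

With this kind of normalisation, two snapshots that differ only in record order compare as identical, and a single edited record shows up as one removed and one added line. It only helps between snapshots of the same DB, since handles change on Extract or Import.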

I was using Windows when this thread was started and so the Archive button was not available.

But I just tried the Archive using 5.2.1 on a Fedora box and am getting errors in the Info.

backup path: /home/<username>/Documents
Tree path: /home/<username>/.gramps/disable/grampsdb


It reports a bad path and claims BSDDB even though the Tree is SQLite.

I found the Archive in the folder with the Database file. But that filename has a padlock Unicode character?

References:

Note that 7 years ago, the GitHub repository of Michael Brown included an experimental JavaScript / batch file archiving tool for Gramps Trees. It is external and has a lot of hardcoded paths.

Maybe a Report could be written that references preferences and would output a CLI script for sequentially making Backups of Trees when a Current Backup isn’t already there?

@Nick-Hall

Is the RCS-based Archive and Extract a feature set that should be deprecated? (Say in Gramps 6?)

It is not compatible with Windows. And a few months later on the same computer, the Interface feature no longer appears in the dialog. The Archive feature makes the Make a Backup section of the wiki more complicated for no apparent added value.

NO! I use it a lot.
Just because Windows doesn’t use RCS doesn’t mean we need to remove this feature!


Thank you for the response! I’m using Fedora and it’s gone missing from the GUI. (Very happy with the faster performance of Gramps over the Windows version.)

When double-checking a wiki link in an answer in a recent reply, I found the Archive section lacking. (It does not say what format the Archive generates nor where files are written.) So that’s when I discovered the button was missing and that my previous attempt to use it gave errors.

The section also needs more info (or a link) on troubleshooting the RCS installation: how to verify it is installed and how to add it. The current link goes to external sites that say it is a failed concept.
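As a starting point for such a troubleshooting note, here is a minimal sketch that checks whether the usual RCS executables are on the PATH. The names `ci`, `co`, `rcs` and `rlog` are the standard RCS commands; which of them Gramps actually looks for, and how, is not stated in this thread, so treat the list as an assumption. `rcs_tools_status` is a hypothetical helper.

```python
import shutil


def rcs_tools_status(tools=("ci", "co", "rcs", "rlog")):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in rcs_tools_status().items():
        print(f"{tool}: {path if path else 'not found on PATH'}")
```

If any entry reports "not found", installing the distribution's rcs package is the likely fix; this says nothing about why the button might be hidden when the tools are present.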

Maybe the feature could be shifted to Addon status? Just so it can go through some enhancement with the prerequisites feedback in the Addon Manager?

The format is an RCS format and the file is stored in the database directory with the SQLite file. The name is something like “rev.gramps,v” and it contains all the versions.


Placeholder for new RCS Archive feature wiki article to be adapted to how Gramps leverages RCS and using the Archive commands from the CLI.

Perplexity AI. (2024, June 23). [Response to prompt to Introduce the Gnu RCS (Revision Control System), its format and naming conventions with reference endnotes in Chicago citing style. ]. https://www.perplexity.ai/

GNU RCS (Revision Control System) is a version control system that manages multiple revisions of files. Here are some key points about RCS:

  1. Purpose: RCS automates the storing, retrieval, logging, identification, and merging of revisions. It’s useful for text that is frequently revised, like source code, documentation, and papers[1][2].

  2. History: RCS was originally developed by Walter F. Tichy at Purdue University in the early 1980s as an improvement over the Source Code Control System (SCCS)[2][4].

  3. File storage: RCS stores the most recent version of a file in full, along with reverse differences (called “deltas”) to reconstruct earlier versions. This improves retrieval speed for recent versions[2][4].

  4. Commands: RCS includes several commands like ci (check-in), co (check-out), rcs (for RCS file manipulation), rcsdiff (for comparing versions), and rlog (for displaying log information)[1][4].

  5. File naming conventions:

    • RCS files typically have a “,v” extension added to the original filename (e.g., “file.txt” becomes “file.txt,v”).
    • RCS files are stored in an “RCS” subdirectory when one exists, and otherwise alongside the working file[1][4].

  6. Version numbering: RCS uses a tree structure for version numbering. The main trunk uses numbers like 1.1, 1.2, 1.3, etc. Branches are created with additional number components, like 1.3.1.1[1].

  7. Locking: RCS uses a locking mechanism to prevent multiple users from simultaneously editing the same file[4].

  8. Keywords: RCS supports special keywords in files that can be automatically expanded to include version information, dates, author names, etc[1].

  9. Limitations: RCS operates on single files and doesn’t support atomic commits across multiple files. It’s primarily designed for single-user or small team use on a local filesystem[4].

GNU RCS is now maintained as part of the GNU Project and is distributed under the GNU General Public License[2][3].

Citations:

  1. GNU Project. “GNU RCS 5.10.1.” Accessed June 23, 2024. https://www.gnu.org/software/rcs/manual/rcs.html.
  2. GNU Project. “GNU RCS.” Accessed June 23, 2024. https://www.gnu.org/software/rcs/.
  3. GNU Project. “RCS - GNU Project - Free Software Foundation (FSF).” Accessed June 23, 2024. http://gnu.ist.utl.pt/software/rcs/rcs.html.
  4. Wikipedia. “Revision Control System.” Accessed June 23, 2024. https://en.wikipedia.org/wiki/Revision_Control_System.
  5. https://gnuwin32.sourceforge.net/packages/rcs.htm

Is it worth the effort? What would we gain? Does someone want to write similar archival functionality for Windows?

If it offered a proper database Diff, that would mean much more efficient tools like GrampsWeb Sync and the Import and Merge. It would also resolve some of the grumbles about each.

In my workflow, I would like the ability to take a series of backup files and generate DIFFs for each that could be explored. That could simplify isolating the place where bad data was introduced.

But basically, while the feature is locked into the core, it is too hard to experiment.

Also, there has been discussion about inserting a comment about the revision in the archive. I recall several different approaches being proposed. And RCS has another.

Also, as Gramps Trees grow into the 100K people and larger range, it seems like the automatic timed backup overhead becomes excessive. However, an hour’s worth of changes would be a tiny DIFF in comparison.

RCS would seem to be ideal for this. It stores deltas and allows both a revision number and log message.
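To make the delta idea concrete, here is a toy sketch of reverse-delta storage: the newest revision is kept in full and each older revision is stored only as the instructions needed to rebuild it from the next newer one. This illustrates the principle only. It is not the RCS file format, and all names in it are hypothetical.

```python
import difflib


def make_reverse_delta(new_lines, old_lines):
    """Describe how to rebuild old_lines from new_lines."""
    matcher = difflib.SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))  # reuse lines of the newer revision
        elif j2 > j1:
            delta.append(("insert", old_lines[j1:j2]))  # lines only in the older one
    return delta


def apply_delta(new_lines, delta):
    """Rebuild the older revision from the newer one and a reverse delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(new_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseDeltaStore:
    """Keeps the latest revision in full plus reverse deltas to older ones."""

    def __init__(self):
        self.head = None
        self.deltas = []  # deltas[-1] rebuilds the revision just before head

    def check_in(self, lines):
        lines = list(lines)
        if self.head is not None:
            self.deltas.append(make_reverse_delta(lines, self.head))
        self.head = lines

    def check_out(self, steps_back=0):
        if not 0 <= steps_back <= len(self.deltas):
            raise ValueError("no such revision")
        lines = self.head
        # Walk backwards from the head, applying the newest delta first.
        for delta in reversed(self.deltas[len(self.deltas) - steps_back:]):
            lines = apply_delta(lines, delta)
        return lines
```

When little changes between check-ins, each stored delta is mostly short "copy" instructions, which is why an hour's worth of edits to a large tree could cost far less than another full backup.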

It does seem ideal if it could be made OS independent. There are already users uploading to Gramps Web from all platforms.

The wiki article about CLI use talks about how to install RCS for use from the CLI. (Dunno if that also enables the Family Tree GUI.) And I still don’t understand why the Archive button disappeared from my Fedora.

What tools are available to Windows users for this purpose? Are there any additional features that would be useful?

I had asked Perplexity.ai (“what tools are available for Windows OS that is similar to GNU RCS?”) and its 1st option was RCS Browser for Windows and Linux.
(Also listed: Visual SourceSafe {VSS}, TortoiseSVN, Monotone, Bazaar.) All those except Visual SourceSafe are open source.

Personally, I would like to be able to generate a script that builds an Archive with DIFFs from a series of backups. If the DIFFs were uncompressed XML (or could be viewed without TOO many steps) then we could begin to make informed choices about managing our backups. I might take my backups from the beginning and end of a month to generate a DIFF. Then clean up that to share the month’s work. (To GrampsWeb or for Merged import by a collaborator on their local DT instance of Gramps.)

You can always use zdiff to examine the differences between two backups. I’m not convinced that using these diffs to create archives would be either easy or useful.
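For platforms without zdiff, a rough Python equivalent might look like the sketch below. It assumes the backups are gzip-compressed XML files; the function names are hypothetical and not part of Gramps.

```python
import difflib
import gzip


def read_backup_lines(path):
    """Read a gzip-compressed XML backup and return its lines."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read().splitlines()


def diff_backups(old_path, new_path):
    """Return a unified diff (list of lines) between two compressed backups."""
    return list(
        difflib.unified_diff(
            read_backup_lines(old_path),
            read_backup_lines(new_path),
            fromfile=str(old_path),
            tofile=str(new_path),
            lineterm="",
        )
    )
```

Calling `diff_backups` on the first and last backup of a month would give the kind of readable DIFF mentioned above, subject to the record-ordering caveat raised earlier in the thread.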

The main use of the “Archive” button is to create snapshots of the database, which can be restored at a later date. Creating an archive before a major change seems very sensible.

Using archives is not essential. Backups provide very similar functionality, but the archives are quite simple and convenient.