Which RegEx syntax does Gramps use?

There are evidently a variety of standard syntaxes for RegEx. Which Regular Expressions standard (and possibly which sub-dialect) was implemented in Gramps?

I’m guessing that it is POSIX rather than Perl. But there are 3 compliance levels of the IEEE POSIX syntax: BRE (Basic Regular Expressions), ERE (Extended Regular Expressions), and the deprecated SRE (Simple Regular Expressions).

If a person is looking for tutorials and cheat sheets on the net, it would be helpful to be certain of which dialect they should be seeking.

Where did you find that SRE is deprecated?
The differences between all these syntaxes are minimal (mostly which characters need escaping).
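To make the "which characters need escaping" point concrete, here is a minimal sketch using Python's standard `re` module (the sample strings are made up for illustration). In Python, as in POSIX ERE, unescaped parentheses form a group; in POSIX BRE it is the other way round, where `\(` and `\)` delimit a group and a bare `(` is a literal.

```python
import re

# Unescaped ( ) and + are operators in Python's re, as in POSIX ERE.
grouped = re.search(r"(ab)+", "xxababxx")
print(grouped.group())   # the repeated group

# Escaped parentheses are literal characters in Python's re.
# In POSIX BRE the same \( \) would delimit a group instead.
literal = re.search(r"\(ab\)", "see (ab) here")
print(literal.group())   # the literal text including parentheses
```

So a pattern copied from a `grep` (BRE) cheat sheet may need its backslashes adjusted before it behaves the same way in a Python-based tool.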

For Gramps we don’t need to know the differences.

Perhaps you don’t understand how it works.
I use regexps every day with Gramps, sed, and grep, and have found no differences in standard usage.

The answer is in the re.py file distributed with Python.

There is also the following page in the Gramps documentation:
https://gramps-project.org/wiki/index.php/Gramps_5.1_Wiki_Manual_-_Filters

The Wikipedia page. It states SRE is deprecated in favor of BRE.

There are entire Books & Websites dedicated to mastering RegEx.

Adding syntax tutorials to the Gramps wiki is reinventing the wheel.

So, we don’t need a wiki anymore.

No. It just doesn’t make sense for RegEx syntax. For a general syntax tutorial, we can suggest an appropriate external webpage in the See Also section. (We need a Gramps RegEx user to suggest a good external webpage with the right dialect of RegEx!)

Tutorials on the application of RegEx to Gramps data make sense.

But adding tutorials on RegEx syntax would be like adding wiki tutorials on your OS or on learning Python. These are already done better elsewhere. The wiki should only address how they relate to Gramps in an unusual way.

If there is no difference, why are there different specifications that are still being maintained separately?

Gramps uses the Python version of regular expressions: the re module (“Regular expression operations”, Python 3.10.2 documentation).

I could not find any reference to suggest that it follows any of the POSIX levels, although for the usage of our filters I suspect that this doesn’t matter.
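As a small illustration of where Python's `re` departs from POSIX, the sketch below (sample names and strings are invented) shows two behaviours that are Perl-style rather than POSIX: alternation takes the first alternative that succeeds instead of the longest overall match, and shorthand such as `\d` and lazy quantifiers such as `+?` are available.

```python
import re

# Python tries alternatives left to right and stops at the first success.
# POSIX specifies the longest match, which would be "Anna" here.
first = re.search(r"Ann|Anna", "Anna Smith").group()

# Perl-style shorthand class and lazy quantifier, neither defined by POSIX BRE/ERE.
year = re.search(r"\d{4}", "born 1842 in Leeds").group()
lazy = re.search(r"<.+?>", "<b>bold</b>").group()

print(first, year, lazy)
```

For typical filter patterns these differences rarely show up, which fits the point above that the POSIX level probably does not matter in practice.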

I think that for the vast majority of regular expressions likely to be used in Gramps, the relatively minor variations in syntax just don’t matter. I suppose that a very advanced user might be able to find some expression where they do; but I also suspect that user could read the Python docs to get the details.

Note in the documentation that the Python version of re has changed over time, usually in a backward compatible fashion. So the actual syntax of Gramps would be that of the version of Python that you have installed to run Gramps.
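One way to see this version dependence is to probe the installed interpreter directly. The sketch below uses a hypothetical helper, `supports`, that simply tries to compile a pattern. Possessive quantifiers such as `a*+` were added to `re` in Python 3.11, so the result depends on which Python is running.

```python
import re
import sys

def supports(pattern: str) -> bool:
    """Return True if this Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print("Python", sys.version_info[:3])
possessive_ok = supports(r"a*+b")   # possessive quantifier, new in 3.11
print("possessive quantifiers supported:", possessive_ok)
```

Running this with the same Python that runs Gramps shows which side of that change your installation is on.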


I cannot help that the term “RegEx” always reminds me of a 2 second video clip from the “Wolf In the Fold” Star Trek (TOS) episode.


In the documentation you cited, the following note appears in the opening section.

See also

The third-party regex module, which has an API compatible with the standard library re module, but offers additional functionality and a more thorough Unicode support.

Given that Gramps is heavy in Unicode use (and the recent inquiry about Case Sensitivity not being controllable), is that third-party regex more appropriate for Gramps than the standard library re module?
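For reference, here is a minimal sketch of what the standard `re` module does with case-insensitive Unicode matching. The surnames and street name are made up, and how Gramps itself compiles filter patterns is an assumption left out of scope here. The third-party regex module is not imported because it may not be installed.

```python
import re

# str patterns are Unicode-aware by default in Python 3.
upper = re.search(r"müller", "Johann MÜLLER", re.IGNORECASE)

# The same flag can be set inside the pattern itself.
inline = re.search(r"(?i)müller", "Johann MÜLLER")

# The standard module compares character by character, so a single
# "ß" does not match the two-character "ss" under IGNORECASE.
folded = re.search(r"strasse", "Hauptstraße", re.IGNORECASE)

print(upper.group(), inline.group(), folded)
```

The inline `(?i)` form is worth knowing because it works even where a tool gives no separate case-sensitivity switch, provided the tool passes the pattern to `re` unchanged.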

If the RegEx syntax is variable… based on the Python version in use, then RegEx “help” that has a dynamic URL would be useful.

One assumes that the basics will be fairly static. But possibly still very un-intuitive and needing documentation.

For example, I just needed to search for name prefixes that were not blank. It is very unlikely that I would intuitively know that the string (.|\s)*\S(.|\s) is a usable RegEx expression to search for a non-NULL string.
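As a hedged aside: if the pattern is applied with search semantics (a match anywhere in the string, which is an assumption about how the filter uses it), a single `\S` may be enough to test for a non-blank value. The helper name `is_non_blank` and the sample prefixes are hypothetical.

```python
import re

# \S matches any one non-whitespace character.
NON_BLANK = re.compile(r"\S")

def is_non_blank(value: str) -> bool:
    """True if the string contains at least one non-whitespace character."""
    return NON_BLANK.search(value) is not None

samples = ["van", "de la", "", "   "]
results = [is_non_blank(s) for s in samples]
print(dict(zip(samples, results)))
```

Empty and whitespace-only strings are rejected, and anything with a visible character is accepted.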


This is my favorite regex debugger which I use all the time (It is necessary for some cybersecurity competitions I do): https://regex101.com

Perhaps a help page on using such a debugger would be useful. We could also establish a standard tag for regex patterns created by Gramps users, so that they can be found in the site’s “community patterns”, and occasionally pick particularly useful examples to add to the wiki.


I’d actually be willing to work on this. I enjoy writing documentation and could probably put something together.


I would be roasted on a pyre if I encouraged your splitting attention from development to User Documentation!!!
The RegEx entry in the Gramps Glossary has an internal link (to an embryonic description of the expression syntax) and an external link (to this Discourse discussion thread) where the documentation can evolve.

But I do advocate GUI hotlinking every instance of a RegEx option. That link would lead to “help” in composing RegEx expressions. Whether that “help” is a hotlink to the Gramps Glossary entry or a pop-up RegEx composition/debugging tool is a design choice… and beyond my ken.

Ha! I look at it more as an opportunity to gain a deeper understanding of some of the Gramps features for future development purposes. It also looks like I might be able to finish multiple projects over the course of the summer.
