Strange filter behavior in 6.0.1

Exploring 6.0.1 on Win10 and have been testing the filters.

Tried using the custom filters carried over from 5.2 and was seeing strange results so scrapped that custom_filter file and started from scratch.

After creating a new set of filters, I wanted to create one final filter that combined people from the 4 other filters. There is a lot of overlap between these filters.

Filter 1 returns 34K people
Filter 2 returns 42K
Filter 3 returns 110K
Filter 4 returns 58K

The combined filter is

People matching the <Filter1>
People matching the <Filter2>
People matching the <Filter3>
People matching the <Filter4>

Option set to: At least one rule must apply.

This filter returns just 114 people. And no, it is not 114K, which I would expect. Just 114.
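
A quick sanity check on why 114 cannot be right: with "At least one rule must apply", the combined result should behave like a set union, so it can never be smaller than the largest sub-filter (110K here) or larger than the sum of all four (244K). The sketch below uses made-up IDs and plain Python sets, not the Gramps filter code, to show the bounds.

```python
# Toy stand-ins for the four person filters; the IDs are invented for illustration.
filter1 = {"I0001", "I0002", "I0003"}
filter2 = {"I0002", "I0004"}
filter3 = {"I0001", "I0002", "I0003", "I0005", "I0006"}
filter4 = {"I0003", "I0007"}
filters = [filter1, filter2, filter3, filter4]

# "At least one rule must apply" behaves like a set union.
combined_or = set().union(*filters)

# "All rules must apply" behaves like a set intersection.
combined_and = set.intersection(*filters)

largest = max(len(f) for f in filters)
total = sum(len(f) for f in filters)

# The union is bounded below by the largest filter and above by the sum.
print(largest, len(combined_or), total)
print(len(combined_and))
```

A result far below the largest sub-filter looks much more like an intersection than a union.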

Could you run it through the Test in FilterParams and see if the results are the same? (And double check that the settings did not change somehow.)

Sending you the small custom_filter.xml of today’s filters.

Wow. That’s some recursive filtering!

I needed the sibling side filters because I know of 4 endline grandparents with a known sibling but still do not know the parents. I suppose I could add them as individuals to the family filter. But I like the concept of a universal filter. In 5.2 I run it off of the active person filter. Unfortunately, it is not working in 6.0. In 5.2 I can select an individual and filter for their complete family including all immediate in-laws and any step children.

By the way, did you test it and are you seeing the same results?

I did test it. But I’m comparing a different tree, so the gauge of whether the results are the “same” has changed.

The multistage filter had the intersection of the 4 filters, not the cumulative results. So I switched that over.

In the Example.gramps tree (after using FilterParams to reset the focal person), it gave a couple hundred matches.

Just imported the Example.gramps from the 6.0.1 files and ran the set of filters I sent you.

Using our martyred hero Lewis Anderson Garner von Anderson as the test, I entered his ID into filter “Filter 1a: Ancestors of…”.

“Filter 1a: Ancestors of…” returned 7 people
“Filter 1b: Siblings of Ancestors” returned 18
“Filter 1c: Descendents of…” returned 93
“Filter 1d: Find the Spouces” returned 46

Now running the filter…
“Family 1A: The Family and their Spouces” returned 0

Filter 1A’s code

    <filter name="Family 1A: The Family and their Spouces" function="or">
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1c: Descendents of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
    </filter>

The full set of filters

<?xml version="1.0" encoding="utf-8"?>
<filters>
  <object type="Person">
    <filter name="Family 1a: Ancestors of..." function="and">
      <rule class="IsAncestorOf" use_regex="False" use_case="False">
        <arg value="I000044"/>
        <arg value="1"/>
      </rule>
    </filter>
    <filter name="Family 1A: The Family and their Spouces" function="or">
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1c: Descendents of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
    </filter>
    <filter name="Family 1b: Siblings of Ancestors" function="and">
      <rule class="IsSiblingOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
    </filter>
    <filter name="Family 1B: The Family, their Spouces an in-Laws." function="or">
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1c: Descendents of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1e: Spouces and the extended in-Laws"/>
      </rule>
    </filter>
    <filter name="Family 1c: Descendents of..." function="or">
      <rule class="IsDescendantOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="IsDescendantOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
    </filter>
    <filter name="Family 1d: Find the Spouces" function="or">
      <rule class="IsSpouseOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="IsSpouseOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
      <rule class="IsSpouseOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1c: Descendents of..."/>
      </rule>
    </filter>
    <filter name="Family 1e: Spouces and the extended in-Laws" function="and">
      <rule class="IsSpouseOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
      <rule class="IsChildOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
      <rule class="IsParentOfFilterMatch" use_regex="False" use_case="False">
        <arg value="Family 1d: Find the Spouces"/>
      </rule>
    </filter>
  </object>
</filters>

I investigated this a bit.

It seems that there might be an issue with the new Optimizer class in Gramps 6.0.

Not sure why but this workaround seems to help: add a line in the method apply_logical_op_to_all in the file gramps/gen/filters/_genericfilter.py (around line 149) as follows:

    optimizer = Optimizer(self)
    handles_in, handles_out = optimizer.get_handles()
    if not handles_in: handles_in = None    # add this line
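
One plausible reading of why this line helps, sketched under an assumption the thread does not confirm: that the calling code treats `handles_in = None` as "no pre-filter, scan everyone" and an empty set as "scan nobody". The helper names below are hypothetical and are not part of Gramps.

```python
def candidate_handles(all_handles, handles_in):
    """Hypothetical helper: choose which handles the rules are run against."""
    if handles_in is None:
        # No pre-filter available: every handle is a candidate.
        return set(all_handles)
    # A pre-filter exists: only handles inside it are candidates.
    return set(all_handles) & handles_in


def normalize(handles_in):
    """Mirror of the one-line workaround: an empty pre-filter becomes None."""
    if not handles_in:
        return None
    return handles_in


all_handles = {"h1", "h2", "h3"}

# An empty pre-filter rules everyone out before any rule is evaluated.
print(len(candidate_handles(all_handles, set())))

# With the workaround, an empty pre-filter falls back to a full scan.
print(len(candidate_handles(all_handles, normalize(set()))))
```

Under this assumption the workaround only helps when the pre-filter comes back completely empty; a small but non-empty pre-filter would still restrict the scan.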

@DaveSch could you file a bug report so that the patch by @kku will get into the 6.0.2 roadmap?

I added the line to _genericfilter.py but am not seeing any change in the results on my database.

I had tried the same set of filters on the Example database using LAGvZ as the base person. Before the modified code, zero (0) people were returned. With the applied code, 137 people were returned. This 137 number appears to be correct.

Running the exact same set of filters on my 5.2.4 database returned 165,278 people.
Running it, with the altered _genericfilter.py, returned 114. :roll_eyes:

This is just guessing but I have two suggestions:

  1. Remove Optimizer completely by changing the lines to:
#       optimizer = Optimizer(self)
#       handles_in, handles_out = optimizer.get_handles()
        handles_in = None
        handles_out = []
  2. Add debug logging with

    gramps -d .filter.optimizer -d .filter.results

Maybe the log output gives a hint.
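
For anyone who prefers to poke at this from Python, here is a minimal sketch of what the two `-d` options amount to, assuming they simply raise the named loggers of the standard `logging` module to DEBUG. The logger names are copied from the command line above; the `enable_debug` helper and the format string are illustrative and not taken from Gramps.

```python
import logging

# Logger names copied from the command line above, leading dot included.
LOGGER_NAMES = [".filter.optimizer", ".filter.results"]


def enable_debug(names):
    """Set each named logger to DEBUG and return the logger objects."""
    loggers = []
    for name in names:
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        loggers.append(logger)
    return loggers


# Illustrative format only; the real Gramps log format may differ.
logging.basicConfig(
    format="%(asctime)s: %(levelname)s: %(filename)s: line %(lineno)d: %(message)s"
)
loggers = enable_debug(LOGGER_NAMES)

# Example debug message routed through the first logger.
loggers[0].debug("Optimizer handles_in: %s", None)
```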

Making the change to:

        handles_in = None
        handles_out = []

Actually solved the problem. 165264 people returned in the filter.

Here is what displayed in the debug window

setup debugging .filter.optimizer
setup debugging .filter.results

(gramps.exe:10956): libenchant-WARNING **: 15:24:40.000: Bad UTF-8 sequence in C:\Users\daves\AppData\Local\enchant\en_US.dic at line:263

(gramps.exe:10956): libenchant-WARNING **: 15:24:40.004: Bad UTF-8 sequence in C:\Users\daves\AppData\Local\enchant\en_US.dic at line:850

(gramps.exe:10956): libenchant-WARNING **: 15:24:40.006: Bad UTF-8 sequence in C:\Users\daves\AppData\Local\enchant\en_US.dic at line:851

2025-04-24 15:27:40.006: DEBUG: _genericfilter.py: line 276: Prepare time: 140.12473893165588 seconds
2025-04-24 15:27:40.015: DEBUG: _genericfilter.py: line 153: Optimizer handles_in: None
2025-04-24 15:27:40.015: DEBUG: _genericfilter.py: line 157: Optimizer handles_out: 0
2025-04-24 15:28:45.741: DEBUG: _genericfilter.py: line 291: Apply time: 65.72609353065491 seconds

I’ll look to see what it did not like in the dictionary file.

Ran the same filter on the example database for LAGvZ with 137 people returned.

This was added to the debug window for that filter.

(gramps.exe:10956): Gtk-CRITICAL **: 15:38:24.208: gtk_tree_model_filter_get_path: assertion 'GTK_TREE_MODEL_FILTER (model)->priv->stamp == iter->stamp' failed

(gramps.exe:10956): Gtk-CRITICAL **: 15:38:24.211: gtk_tree_model_filter_get_path: assertion 'GTK_TREE_MODEL_FILTER (model)->priv->stamp == iter->stamp' failed
2025-04-24 15:38:52.447: DEBUG: _genericfilter.py: line 276: Prepare time: 1.5196986198425293 seconds
2025-04-24 15:38:52.447: DEBUG: _genericfilter.py: line 153: Optimizer handles_in: None
2025-04-24 15:38:52.448: DEBUG: _genericfilter.py: line 157: Optimizer handles_out: 0
2025-04-24 15:38:53.286: DEBUG: _genericfilter.py: line 291: Apply time: 0.8390529155731201 seconds

If anyone is interested, the lines in the dictionary file were words in the possessive ('s).

I deleted them.

I see a potential issue. Can someone check what handles_in is before you set it to None? (I’m guessing it is an empty set.) If so, we can fix this without this workaround.

The problem reproduces with a simpler filter

    <filter name="Family 1A: The Family and their Spouces" function="or">
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1a: Ancestors of..."/>
      </rule>
      <rule class="MatchesFilter" use_regex="False" use_case="False">
        <arg value="Family 1b: Siblings of Ancestors"/>
      </rule>
    </filter>

As @DaveSch says, using the example tree, the partial results are:
“Filter 1a: Ancestors of…” returned 7 people
“Filter 1b: Siblings of Ancestors” returned 18

For some reason the optimizer.get_handles method is applying an “and” condition:
handles_in = intersection([handles_in] + selected_handles)
handles_in is length 7 and selected_handles is length 18, which looks correct. The intersection has length 0, rather than the 25 a union would give.

@dsblank we have two sub-filters, each of which has the “and” condition. The combined filter then uses the “or” condition. Therefore, I don’t think that the code should even be applying an intersection at this point. It looks to be applying the condition from the sub-filter rather than the condition that should apply to the combined filter.
If I modify the two sub-filters to use the ā€œorā€ condition, then the expected result of 25 people is returned.

@dsblank I wasn’t quite sure where you meant so this might be more detail than you need

In Optimizer.get_handles, self.all_selected_handles contains the results of the two sub-filters. These appear to be correct, having the expected length.
Each of the sub-filters contains only a single rule but has the logical operator ‘and’ (which obviously has no meaning given there is only a single rule). The top-level filter has the condition ‘or’.

In the initial loop

    for inverted, logical_op, selected_handles in self.all_selected_handles:
        if logical_op == "and" and not inverted:
            LOG.debug("optimizer positive match!")
            if handles_in is None:
                handles_in = intersection(selected_handles)
            else:
                handles_in = intersection([handles_in] + selected_handles)

In the first iteration logical_op is “and” and we set handles_in to intersection(selected_handles). handles_in has length 7.

In the second iteration, logical_op is “and” again and we set handles_in to intersection([handles_in] + selected_handles). handles_in now has length 0, i.e. there is no intersection. Logically this is correct: you would not expect an intersection between a person’s siblings and ancestors.

handles_out also has length 0 and the method returns two sets, each of length 0.
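
The two iterations can be replayed outside Gramps with toy data. This is only a sketch: the `intersection` helper is a hypothetical stand-in for whatever the optimizer uses, `selected_handles` is assumed to be a list of sets (as the `[handles_in] + selected_handles` expression suggests), and the handles are invented.

```python
def intersection(sets):
    """Hypothetical stand-in: intersect a list of sets."""
    if not sets:
        return set()
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
    return result


# Toy handles: 7 "ancestors" and 18 "siblings" with no overlap.
ancestors = {f"a{i}" for i in range(7)}
siblings = {f"s{i}" for i in range(18)}

# (inverted, logical_op, selected_handles), mirroring the loop quoted above.
all_selected_handles = [
    (False, "and", [ancestors]),
    (False, "and", [siblings]),
]

handles_in = None
for inverted, logical_op, selected_handles in all_selected_handles:
    if logical_op == "and" and not inverted:
        if handles_in is None:
            handles_in = intersection(selected_handles)
        else:
            handles_in = intersection([handles_in] + selected_handles)

# What the parent "or" filter should have produced instead.
union = ancestors | siblings

print(len(handles_in), len(union))  # prints "0 25"
```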

Execution returns to GenericFilter.apply_logical_op_to_all, where the following log entries are generated:

2025-04-25 08:35:58.879: DEBUG: _genericfilter.py: line 150: Optimizer handles_in: 0
2025-04-25 08:36:19.234: DEBUG: _genericfilter.py: line 154: Optimizer handles_out: 0

I’m not sure that it is using the correct logical_op. It appears to be using the logical_op from the second sub-filter to determine how to combine the results of the first and second sub-filter. It seems that it should be using the logical_op of the parent filter (‘or’), which would result in handles_in being None.
If I force handles_in = None after the first loop in Optimizer.get_handles then I get the expected result, with the log entries:

2025-04-25 08:46:27.479: DEBUG: _genericfilter.py: line 150: Optimizer handles_in: None
2025-04-25 08:46:28.044: DEBUG: _genericfilter.py: line 154: Optimizer handles_out: 0
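
To make the suggestion concrete, here is a hypothetical sketch of a pre-filter that keys off the parent filter's operator rather than the sub-filters' own. It is not the Gramps code and not the eventual fix; the function name and signature are invented for illustration.

```python
def prefilter(parent_op, child_results):
    """Hypothetical: derive handles_in from the sub-filter results.

    parent_op: operator of the combining (parent) filter, "and" or "or".
    child_results: list of sets of handles, one per sub-filter.
    Returns a set to restrict the scan to, or None for "no restriction".
    """
    if parent_op == "and" and child_results:
        # Every sub-filter must match, so only the overlap can qualify.
        result = set(child_results[0])
        for handles in child_results[1:]:
            result &= handles
        return result
    # For "or" (or no sub-filters), intersecting would wrongly discard matches.
    return None


ancestors = {f"a{i}" for i in range(7)}
siblings = {f"s{i}" for i in range(18)}

print(prefilter("or", [ancestors, siblings]))        # None: no restriction
print(len(prefilter("and", [ancestors, siblings])))  # 0: disjoint sets
```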

Thanks @SteveY! I think you have identified the issue. Do you want to make a PR? Otherwise I’ll fix it this weekend.

Bug filed.

I’ve posted a potential fix for review
Fix Filter Optimizer Logical Op · Pull Request #2052

After installing your changes (and remembering to revert back to the original _genericfilter.py), I am getting the expected number of returns. Tested it on the example database and LAGvZ had the expected 137 family members.