Difference in DataDict vs Object Behaviour

@dsblank Hi Doug,
There appears to be a subtle difference between the way a true Gramps object and its DataDict representation behave.

Background: this is in the context of updating a custom event role type (e.g. “Father”) to a built-in event role type (e.g. 11 - Father)

First, the code using a Person object:

person = self.get_person_from_handle("10244800c8436c7c90191cadb6db")
print(f"person.event_ref_list[0].role: {person.event_ref_list[0].role.get_object_state()}")
event_role = person.event_ref_list[0].get_role()
print(f"event_role: '{event_role.get_object_state()}'")
event_role.set(event_role.string)
print(f"event_role: '{event_role.get_object_state()}'")
print(f"person.event_ref_list[0].role.get_object_state(): '{person.event_ref_list[0].role.get_object_state()}'")

Essentially: get a person by handle, get the event role of the first event ref, set the event role from its current string and, in the very last line, observe that the data in the person has in fact been updated. Exactly as expected. Here’s the full output:

person.event_ref_list[0].role: {'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}
event_role: '{'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}'
event_role: '{'_class': 'EventRoleType', 'value': 11, 'string': ''}'
person.event_ref_list[0].role.get_object_state(): '{'_class': 'EventRoleType', 'value': 11, 'string': ''}'

The role type changes from 'EventRoleType', 'value': 0, 'string': 'Father' to 'EventRoleType', 'value': 11, 'string': ''

Now the same code using DataDict. The only change is the first line

person = self.get_raw_person_data("10244800c8436c7c90191cadb6db")
print(f"person.event_ref_list[0].role: {person.event_ref_list[0].role.get_object_state()}")
event_role = person.event_ref_list[0].get_role()
print(f"event_role: '{event_role.get_object_state()}'")
event_role.set(event_role.string)
print(f"event_role: '{event_role.get_object_state()}'")
print(f"person.event_ref_list[0].role.get_object_state(): '{person.event_ref_list[0].role.get_object_state()}'")

And the output

person.event_ref_list[0].role: {'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}
event_role: '{'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}'
event_role: '{'_class': 'EventRoleType', 'value': 11, 'string': ''}'
person.event_ref_list[0].role.get_object_state(): '{'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}'

In the final line, the role in the person has not been updated: it is still 'EventRoleType', 'value': 0, 'string': 'Father'.
event_role itself has updated correctly.

It appears event_role is a value copy of the data when a DataDict is used, but a reference when the true object is used.
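
To make the suspicion concrete, here is a minimal sketch of how a value copy could arise. It assumes a dict-based wrapper that builds a fresh wrapper (and therefore a copy) every time a nested dict is accessed as an attribute. The `CopyingView`, `Role` and `Ref` classes are hypothetical stand-ins written for this illustration; they are not the Gramps `DataDict`, `EventRoleType` or `EventRef` code, and the real cause may differ.

```python
class CopyingView(dict):
    """Hypothetical dict wrapper; NOT the Gramps DataDict implementation."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Passing a dict to a dict subclass constructor copies its items,
        # so every access hands back a new, independent object.
        if isinstance(value, dict):
            return CopyingView(value)
        return value


class Role:
    """Hypothetical plain object standing in for a role type."""

    def __init__(self, value, string):
        self.value = value
        self.string = string


class Ref:
    """Hypothetical plain object that stores a reference to its role."""

    def __init__(self, role):
        self.role = role


# Real-object case: the local name and the parent share one Role instance.
obj_ref = Ref(Role(0, "Father"))
obj_role = obj_ref.role
obj_role.value, obj_role.string = 11, ""
object_parent_value = obj_ref.role.value

# Wrapper case: the local name is a copy, so the parent never sees the change.
view_ref = CopyingView({"role": {"value": 0, "string": "Father"}})
view_role = view_ref.role
view_role["value"] = 11
view_role["string"] = ""
view_child_value = view_role.value
view_parent_value = view_ref.role.value

print(object_parent_value, view_child_value, view_parent_value)  # 11 11 0
```

The pattern matches the output above: the local variable changes in both cases, but only the real-object version propagates the change back to the parent.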

What have I overlooked?
Steve

Nothing :smile: I think that was a bug that is fixed in:

Unfortunately, I don’t think it fixes the problem. I merged that PR on top of my code and still see it.
I don’t have a proxy DB in use, and the only change in your PR that could be related is the one to json_utils.py.

OK, I’ll take a look. Thanks for checking.

Here is slightly updated test code. It uses role instead of get_role() so that the value stays a DataDict.
It also prints the type of each variable.

        person = self.get_raw_person_data("10244800c8436c7c90191cadb6db")
        print(f"{type(person)} person.event_ref_list[0].role: {person.event_ref_list[0].role.get_object_state()}")
        event_role = person.event_ref_list[0].role
        print(f"{type(event_role)} event_role: '{event_role.get_object_state()}'")
        event_role.set(event_role.string)
        print(f"{type(event_role)} event_role: '{event_role.get_object_state()}'")
        print(f"{type(person)} person.event_ref_list[0].role.get_object_state(): '{person.event_ref_list[0].role.get_object_state()}'")

and the new output

Using DataDict
<class 'gramps.gen.lib.json_utils.DataDict'> person.event_ref_list[0].role: {'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}
<class 'gramps.gen.lib.json_utils.DataDict'> event_role: '{'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}'
<class 'gramps.gen.lib.json_utils.DataDict'> event_role: '{'_class': 'EventRoleType', 'value': 11, 'string': ''}'
<class 'gramps.gen.lib.json_utils.DataDict'> person.event_ref_list[0].role.get_object_state(): '{'_class': 'EventRoleType', 'value': 0, 'string': 'Father'}'

I expect the last line to be
'value': 11, 'string': ''

You can find a test DB that contains this handle in the zip file of this commit:
Coalesce event role types by stevenyoungs · Pull Request #2141 · gramps-project/gramps
The zip contains a full Gramps DB directory.

Adding the following to the end of the code confirms that when a DataDict is used, event_role is a value copy.

print(f"{hex(id(event_role))}")
print(f"{hex(id(person.event_ref_list[0].role))}")

Using the Person object (self.get_person_from_handle) gives the same pointer for both objects

0x2d9df0752b0
0x2d9df0752b0

but when DataDict is used (self.get_raw_person_data), I get different pointer values for event_role and person.event_ref_list[0].role

0x2d9ddfeb750
0x2d9df0ccdd0

Unexpectedly, this code gives two different id values

        person = self.get_raw_person_data("10244800c8436c7c90191cadb6db")
        print(f"person.event_ref_list[0].role: {hex(id(person.event_ref_list[0].role))}")
        print(f"person.event_ref_list[0].role: {hex(id(person.event_ref_list[0].role))}")
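
One caveat when comparing printed id values: in CPython, id() is only guaranteed to be unique among objects that are alive at the same time, so the id of a temporary can be reused once it is freed. A sturdier check is to hold both results in variables and compare them with `is`. The sketch below uses two hypothetical containers, `FreshOnAccess` and `StoredReference`, invented for this illustration; neither is Gramps code.

```python
class FreshOnAccess:
    """Hypothetical container whose property builds a new dict on each access."""

    def __init__(self, role):
        self._role = role

    @property
    def role(self):
        return dict(self._role)  # a new object every time


class StoredReference:
    """Hypothetical container that simply stores and returns one object."""

    def __init__(self, role):
        self.role = role


def same_object_each_access(container):
    # Keep both results alive so the comparison cannot be fooled by id reuse.
    first = container.role
    second = container.role
    return first is second


fresh_result = same_object_each_access(FreshOnAccess({"value": 0, "string": "Father"}))
stored_result = same_object_each_access(StoredReference({"value": 0, "string": "Father"}))

print(fresh_result, stored_result)  # False True
```

Running the same helper against the object returned by get_raw_person_data would show whether each attribute access yields a new object, independent of what the memory allocator happens to do.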

Ok, first thanks for highlighting these effects and behavior.

I tried wrestling with the issue when we first designed DataDict, but it became complicated. I looked at it again, and it would still be complicated.

I think the best, simplest solution is to make DataDict a read-only structure. That solves all of these issues without complication.
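
As a rough illustration of what read-only could mean here, the sketch below blocks the in-place mutators of a dict subclass so that a write fails loudly instead of silently changing a copy. `ReadOnlyDict` is a hypothetical name invented for this example; it is an assumption about the general approach, not the contents of the planned PR.

```python
class ReadOnlyDict(dict):
    """Hypothetical read-only mapping; not the Gramps DataDict."""

    def _blocked(self, *args, **kwargs):
        raise TypeError("this structure is read-only")

    # Route every in-place mutator of dict to the same error.
    __setitem__ = __delitem__ = __ior__ = _blocked
    update = pop = popitem = clear = setdefault = _blocked

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Nested dicts are wrapped too, so they are equally protected.
        return ReadOnlyDict(value) if isinstance(value, dict) else value

    def __setattr__(self, name, value):
        self._blocked()


ref = ReadOnlyDict({"role": {"value": 0, "string": "Father"}})
role = ref.role

try:
    role["value"] = 11
    write_rejected = False
except TypeError:
    write_rejected = True

print(write_rejected, ref.role.value)  # True 0
```

With something along these lines, the test code earlier in the thread would raise at the point of modification rather than appear to succeed on a detached copy.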

I’ll work up a PR that would ensure that they are read-only (and I’ll need to change the updated proxy PR mentioned above as well).