[OCWR] Week 13 - OpenCitations Weekly Report
Week from Oct 26 to Nov 01
Introduction
During the thirteenth week, many changes were made to the code of oc_ocdm. Exploiting the flags introduced in last week's report, I added a method to mark an entity as to be deleted (the “delete” method). I also added two more fields to the GraphEntity class, added the “remove” methods to the ProvEntity class (for which they were still missing), and removed every entity method that had inverse logic and/or modified the instance of another entity instead of its own. I then added “getter” methods to easily extract information from the entities and, finally, the “merge” and “import_from_graph” methods.
Report
Various fixes to the codebase
The “remove” methods of the ProvEntity class had been missing since the 1.1.0 release. I added them and fixed the test functions accordingly.
Across the various entities, a few methods behaved strangely: either they applied an inverse logic with respect to the OCDM prescriptions, or they added/removed a triple not in their own graph but in that of another entity. This led to a certain amount of unpredictability, since the user couldn’t know in advance which entity would be affected by calling a method on another entity. Since this would have made adding the new “getter”, “merge” and “import_from_graph” methods very difficult, I chose to fix this problem once and for all.
The following changes were made:
- AgentRole
  - from follows(self, ar_res: AgentRole) to has_next(self, ar_res: AgentRole)
  - from remove_follows to remove_next
  - added is_held_by(self, ra_res: ResponsibleAgent) and remove_held_by
  - from create_publisher(self, br_res: BibliographicResource) to create_publisher(self)
  - from create_author(self, br_res: BibliographicResource) to create_author(self)
  - from create_editor(self, br_res: BibliographicResource) to create_editor(self)
  - from remove_role_and_document to remove_role_type
- BibliographicReference
  - added references_br(self, br_res: BibliographicResource) and remove_references
- BibliographicResource
  - from has_part(self, br_res: BibliographicResource) to is_part_of(self, br_res: BibliographicResource)
  - from remove_part to remove_part_of
  - deleted contained_in_discourse_element(self, de_res: DiscourseElement)
  - added has_contributor(self, ar_res: AgentRole) and remove_contributor
  - deleted has_reference(self, be_res: BibliographicReference) and remove_reference
- DiscourseElement
  - deleted contained_in_discourse_element(self, de_res: DiscourseElement)
- ResponsibleAgent
  - deleted has_role(self, ar_res: AgentRole) and remove_role
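The principle behind these changes, that a method only ever touches the graph of the entity it is called on, can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins and not the oc_ocdm implementation: triples are plain tuples kept in a set rather than in an rdflib.Graph, and the predicate label is a placeholder.

```python
class ToyEntity:
    """Hypothetical stand-in for an entity: it owns a set of (s, p, o) tuples."""

    def __init__(self, res: str) -> None:
        self.res = res
        self.g: set = set()


class ToyResponsibleAgent(ToyEntity):
    pass


class ToyAgentRole(ToyEntity):
    HELD_BY = "isHeldBy"  # placeholder predicate label (assumption)

    def is_held_by(self, ra_res: ToyResponsibleAgent) -> None:
        # The triple is written into this role's own graph only.
        self.g.add((self.res, self.HELD_BY, ra_res.res))

    def remove_held_by(self) -> None:
        # Again, only this role's own graph is modified.
        self.g = {t for t in self.g if t[1] != self.HELD_BY}


role = ToyAgentRole("ar/1")
agent = ToyResponsibleAgent("ra/1")
role.is_held_by(agent)
role_triples = set(role.g)    # snapshot of the role's graph after the call
agent_triples = set(agent.g)  # the agent's graph was never touched
role.remove_held_by()
```

With this shape, the effect of a call is always confined to the receiver, which is what makes getters and merging predictable.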
New GraphEntity fields and new getter methods
I added two new fields to the GraphEntity class: preexisting_graph and merge_list. The first is an rdflib.Graph that can hold the representation of the already existing persistent data related to the considered GraphEntity instance; the second is the (possibly empty) list of GraphEntity instances that have been merged into the considered one.
I then added the “getter” methods to every entity class. It is now much simpler to read the data contained inside an entity: just call the corresponding method on the considered instance. All these methods have names starting with get_ and take no parameters.
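As a rough illustration of the two fields and of the getter convention, here is a simplified sketch. It is not the oc_ocdm code: the class ToyGraphEntity, the methods has_title and get_title, and the "title" predicate label are hypothetical, and plain sets of tuples stand in for rdflib.Graph objects.

```python
from typing import List, Optional


class ToyGraphEntity:
    """Hypothetical stand-in for GraphEntity with the two new fields."""

    def __init__(self, res: str) -> None:
        self.res = res
        self.g: set = set()                  # current triples of the entity
        self.preexisting_graph: set = set()  # stand-in for the persisted data
        self.merge_list: List["ToyGraphEntity"] = []  # entities merged into this one

    def has_title(self, title: str) -> None:
        # Hypothetical setter: replace any previous title with the new one.
        self.g = {t for t in self.g if t[1] != "title"}
        self.g.add((self.res, "title", title))

    def get_title(self) -> Optional[str]:
        # Hypothetical getter: name starts with get_ and takes no parameters.
        for _, predicate, value in self.g:
            if predicate == "title":
                return value
        return None


br = ToyGraphEntity("br/1")
br.has_title("A sample title")
title = br.get_title()
```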
New “delete”, “merge” and “import_from_graph” methods
A new “delete” method was added to let the user mark an entity as to be deleted: such an entity will actually be removed from the persistent storage during the “store” operation. This method is called mark_as_to_be_deleted and takes as input a boolean parameter, which is the value assigned to the to_be_deleted flag of the entity.
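The interplay between the flag and the “store” operation might look roughly like the sketch below. Only the names mark_as_to_be_deleted and to_be_deleted come from the report; the class ToyEntity, the function toy_store and the dictionary used as persistent storage are assumptions made for illustration.

```python
from typing import Dict, Iterable


class ToyEntity:
    """Hypothetical stand-in for an entity carrying the to_be_deleted flag."""

    def __init__(self, res: str, data: str) -> None:
        self.res = res
        self.data = data
        self.to_be_deleted = False

    def mark_as_to_be_deleted(self, value: bool) -> None:
        # The boolean argument is stored as-is in the flag.
        self.to_be_deleted = value


def toy_store(entities: Iterable[ToyEntity], storage: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical store step: write live entities, drop the flagged ones."""
    for entity in entities:
        if entity.to_be_deleted:
            storage.pop(entity.res, None)  # actual removal happens only here
        else:
            storage[entity.res] = entity.data
    return storage


storage = {"br/1": "old data 1", "br/2": "old data 2"}
keep = ToyEntity("br/1", "new data 1")
drop = ToyEntity("br/2", "new data 2")
drop.mark_as_to_be_deleted(True)
toy_store([keep, drop], storage)
```

Since the method only sets a flag, passing False before the store step would undo the mark in this sketch.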
In order to give the user the ability to merge together two entities of the same type, an appropriate “merge” method was added to each entity class. From now on, a method called merge can be called on each GraphEntity instance. Executing A.merge(B), with A and B being two entities of the same type (instances of the same class), means that every triple from B is added to A (possibly overwriting the corresponding triples of A) and that B is then marked as to be deleted. Before being stored back in persistence, the entity A can be merged with several other entities Bi (with i=1..n): a list of references to all the Bi instances that were merged into A is kept in the field A.merge_list.
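The semantics of A.merge(B) described above can be sketched as follows. This is a simplified model under stated assumptions, not the library's code: ToyMergeable is a hypothetical class, and each entity is reduced to a dictionary of single-valued properties so that "overwriting" is easy to see.

```python
from typing import Dict, List


class ToyMergeable:
    """Hypothetical stand-in: properties are a predicate -> value dictionary."""

    def __init__(self, res: str, props: Dict[str, str]) -> None:
        self.res = res
        self.props = dict(props)
        self.merge_list: List["ToyMergeable"] = []
        self.to_be_deleted = False

    def merge(self, other: "ToyMergeable") -> None:
        if type(other) is not type(self):
            raise TypeError("only entities of the same type can be merged")
        # Every property of `other` is copied, overwriting matching ones here.
        self.props.update(other.props)
        # Keep a reference to the merged entity, then flag it for deletion.
        self.merge_list.append(other)
        other.to_be_deleted = True


a = ToyMergeable("br/1", {"title": "Old title", "year": "2019"})
b1 = ToyMergeable("br/2", {"title": "New title"})
b2 = ToyMergeable("br/3", {"pages": "10-20"})
a.merge(b1)
a.merge(b2)
```

After the two calls, a holds the union of the properties, with b1's title replacing its own, and both b1 and b2 are flagged for deletion.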
Finally, I explored the possibility of reconstructing proper GraphEntity instances from data imported from various sources. A new “import_from_graph” function can now enrich a GraphSet instance with entities extracted from a generic rdflib.Graph. For now, the graph must be provided by the user and the data it contains can come from any source. In the future, we should consider writing support functions to help the user retrieve entities directly from online triplestores, RDF text files, etc.
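The general idea of enriching a set of entities from a flat graph can be sketched as below. This is only an illustration and makes no claim about how the real function works: toy_import_from_graph is a hypothetical name, the graph is an iterable of (subject, predicate, object) tuples rather than an rdflib.Graph, and the "graph set" is a dictionary keyed by subject.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def toy_import_from_graph(graph_set: Dict[str, Set[Triple]],
                          graph: Iterable[Triple]) -> Dict[str, Set[Triple]]:
    """Hypothetical import: group triples by subject, one entity per subject."""
    grouped: Dict[str, Set[Triple]] = defaultdict(set)
    for s, p, o in graph:
        grouped[s].add((s, p, o))
    for subject, triples in grouped.items():
        # Entities already in the set are enriched, new ones are created.
        graph_set.setdefault(subject, set()).update(triples)
    return graph_set


graph = [
    ("br/1", "title", "A sample title"),
    ("br/1", "year", "2020"),
    ("ra/1", "name", "Jane Doe"),
]
graph_set = {"br/1": {("br/1", "pages", "1-5")}}
toy_import_from_graph(graph_set, graph)
```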