[OCWR] Week 3 - OpenCitations Weekly Report
Week from Aug 17 to Aug 23
Introduction
The goal of the third week was to complete the code refactoring of graphlib.py. To get to that point, all the classes had to be completed with their missing methods, a test had to be written for each class, and the code had to be filled with docstrings in order to provide an initial level of documentation. I also had to write a proper README.md file for the online repository and refactor the code from datasethandler.py (have a look here) into two distinct Python modules: dataset.py and distribution.py.
Report
Added missing methods
Strictly following the latest version of the OCDM described in this document, I added the missing methods to each class in the GraphEntity subclass hierarchy. They are listed below:
BibliographicResource
- has_edition
- has_related_document
Citation
- has_citation_creation_date
- has_citation_time_span
- has_citation_characterization
- create_self_citation
- create_affiliation_self_citation
- create_author_network_self_citation
- create_author_self_citation
- create_funder_self_citation
- create_journal_self_citation
- create_journal_cartel_citation
- create_distant_citation
ResponsibleAgent
- has_related_agent
ResourceEmbodiment
- create_digital_embodiment
- create_print_embodiment
- has_url
- has_media_type
DiscourseElement
- create_section
- create_section_title
- create_paragraph
- create_table
- create_footnote
- create_caption
ReferencePointer
- has_next_rp
Unit Testing
Inside the tests folder of the oc_ocdm repository (prepared in advance last week), I created a test module for each of oc_ocdm's classes. Some tests may still be missing, but I hope to finish them as soon as possible.
Since many of the methods in oc_ocdm's classes are very similar to each other, the vast majority of the respective tests turned out to be similar as well. Here is an example of the most common type of test:
def test_has_id(self):
    result = self.entity.has_id(self.identifier)
    self.assertIsNone(result)
    triple = URIRef(str(self.entity)), GraphEntity.has_identifier, URIRef(str(self.identifier))
    self.assertIn(triple, self.entity.g)
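The same pattern can be tried out without rdflib or oc_ocdm installed. The following is only a minimal sketch under stated assumptions: FakeEntity, its HAS_IDENTIFIER constant, the example.org URIs and the run_tests helper are hypothetical stand-ins and not part of the oc_ocdm API, and a plain Python set of tuples takes the place of the RDF graph.

```python
import unittest


class FakeEntity:
    """Hypothetical stand-in for a GraphEntity subclass (not the oc_ocdm API)."""

    # Assumed placeholder predicate; the real code uses GraphEntity.has_identifier.
    HAS_IDENTIFIER = "https://example.org/hasIdentifier"

    def __init__(self, uri):
        self.uri = uri
        self.g = set()  # set of (subject, predicate, object) tuples

    def __str__(self):
        return self.uri

    def has_id(self, identifier):
        # Store the link as a triple; the method implicitly returns None.
        self.g.add((self.uri, self.HAS_IDENTIFIER, str(identifier)))


class TestFakeEntity(unittest.TestCase):
    def setUp(self):
        # Fresh fixtures for every test, as in a typical unittest module.
        self.entity = FakeEntity("https://example.org/br/1")
        self.identifier = FakeEntity("https://example.org/id/1")

    def test_has_id(self):
        result = self.entity.has_id(self.identifier)
        self.assertIsNone(result)
        triple = (str(self.entity), FakeEntity.HAS_IDENTIFIER, str(self.identifier))
        self.assertIn(triple, self.entity.g)


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFakeEntity)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test of this kind checks two things: that the method returns nothing, and that exactly the expected triple ends up in the entity's graph.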
Documentation and README.md
During the week, I found very little time for writing documentation, so I was only able to write class docstrings, using the definitions provided by the OCDM. Docstrings for the methods are still missing.
I wrote a README.md file with information about how to manage the Poetry package and how to run tests locally.
DatasetHandler code refactoring
I started the DatasetHandler code refactoring by creating two separate classes: Dataset inside dataset.py and Distribution inside distribution.py. I moved into each class the methods from datasethandler.py that belong to it, and I added type hints to each method. Additional work may be needed in the near future.
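As an illustration only, a split of this kind with type hints could look roughly like the sketch below. All attribute and method names here (has_media_type, has_download_url, add_distribution, download_urls) are hypothetical and chosen for the example; they are not taken from datasethandler.py, and both classes are shown in one block although they would live in separate modules.

```python
from typing import List, Optional


class Distribution:
    """Sketch of what distribution.py might hold (hypothetical names)."""

    def __init__(self, uri: str) -> None:
        self.uri: str = uri
        self.media_type: Optional[str] = None
        self.download_url: Optional[str] = None

    def has_media_type(self, media_type: str) -> None:
        self.media_type = media_type

    def has_download_url(self, url: str) -> None:
        self.download_url = url


class Dataset:
    """Sketch of what dataset.py might hold (hypothetical names)."""

    def __init__(self, uri: str, title: str) -> None:
        self.uri: str = uri
        self.title: str = title
        self.distributions: List[Distribution] = []

    def add_distribution(self, distribution: Distribution) -> None:
        # A dataset only references its distributions; their details stay in Distribution.
        self.distributions.append(distribution)

    def download_urls(self) -> List[str]:
        # Skip distributions that have no download URL set yet.
        return [d.download_url for d in self.distributions if d.download_url is not None]


dataset = Dataset("https://example.org/dataset/1", "Example dataset")
dump = Distribution("https://example.org/distribution/1")
dump.has_media_type("application/zip")
dump.has_download_url("https://example.org/dump.zip")
dataset.add_distribution(dump)
dataset.add_distribution(Distribution("https://example.org/distribution/2"))
```

With the annotations in place, a static checker such as mypy can flag calls that pass the wrong kind of object, for example a plain string where a Distribution is expected.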