CIDOC CRM

I’ve been exploring the CIDOC Conceptual Reference Model (CRM) with an eye toward using it in Paradicms. The CRM is an abstract data model for cultural heritage information. It specifies the semantics of an entity class hierarchy and a hierarchy of properties that entities can have.

Syntax is left to implementors. A group in Erlangen has published an OWL implementation of the CRM, and I’ve seen examples in an ad hoc XML syntax as well. Beyond that, guidance and examples for implementors are hard to come by.

One way I learn new technologies is by relating them to technologies I already know. Over the long weekend I wrote a script to translate the Erlangen CRM OWL to Python 3 @dataclass definitions for entities and properties. The CRM hierarchies are directed graphs with multiple inheritance rather than simple trees. A few places ran afoul of Python’s rules for multiple inheritance (the C3 linearization that determines the method resolution order, or MRO), which required some tricks to circumvent.
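
To make the MRO issue concrete, here is a minimal sketch rather than the actual generated code. The class names are simplified, illustrative versions of CRM entities, the fields are invented for the example, and prune_redundant_bases is a hypothetical helper showing just one possible workaround. It assumes the conflict comes from a class that lists both an ancestor and one of its descendants as direct bases, which is only one way such a conflict can arise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class E1CrmEntity:
    identifier: str


@dataclass
class E70Thing(E1CrmEntity):
    pass


@dataclass
class E71HumanMadeThing(E70Thing):
    pass


@dataclass
class E72LegalObject(E70Thing):
    pass


@dataclass
class E18PhysicalThing(E72LegalObject):
    pass


# A diamond: both bases descend from E70Thing. Python linearizes this fine.
@dataclass
class E24PhysicalHumanMadeThing(E18PhysicalThing, E71HumanMadeThing):
    label: Optional[str] = None  # illustrative field, not from the CRM


mro_names = [cls.__name__ for cls in E24PhysicalHumanMadeThing.__mro__]
# E24..., E18..., E72..., E71..., E70..., E1..., object

# Listing an ancestor before its own descendant cannot be linearized.
try:
    type("Conflicting", (E70Thing, E71HumanMadeThing), {})
    conflict_raised = False
except TypeError:
    conflict_raised = True


def prune_redundant_bases(bases):
    """Hypothetical workaround: drop any base that another base already inherits from."""
    return tuple(
        base
        for base in bases
        if not any(other is not base and issubclass(other, base) for other in bases)
    )


# After pruning, only E71HumanMadeThing remains and the class can be created.
Repaired = type("Repaired", prune_redundant_bases((E70Thing, E71HumanMadeThing)), {})
```

Reordering the bases so the more specific class comes first would also resolve this particular case; pruning just keeps the generated class statements shorter.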

A team at the Getty has done something similar, translating a different version of the CRM into Python using metaclasses to generate entity and property classes at runtime, rather than pre-generating .py files as I do. I prefer the latter approach because having the classes on disk makes it easier for IDEs like PyCharm to provide autocomplete and other support for exploring the code.
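
The difference between the two approaches can be sketched in a few lines. This is not the Getty’s code or mine: SPEC is a made-up, two-class stand-in for the parsed OWL, build_runtime and render_source are hypothetical names, and the runtime variant uses dataclasses.make_dataclass instead of a metaclass to keep the example short.

```python
from dataclasses import make_dataclass

# Hypothetical stand-in for the parsed ontology: class name -> parent names.
# Parents are assumed to be listed before their children.
SPEC = {
    "E1CrmEntity": [],
    "E70Thing": ["E1CrmEntity"],
}


def build_runtime(spec):
    """Create the classes in memory; nothing is written to disk."""
    classes = {}
    for name, parents in spec.items():
        bases = tuple(classes[parent] for parent in parents)
        classes[name] = make_dataclass(name, [], bases=bases)
    return classes


def render_source(spec):
    """Render the same hierarchy as Python source text for a .py file."""
    lines = ["from dataclasses import dataclass", ""]
    for name, parents in spec.items():
        base_list = f"({', '.join(parents)})" if parents else ""
        lines += ["", "@dataclass", f"class {name}{base_list}:", "    pass", ""]
    return "\n".join(lines)


runtime_classes = build_runtime(SPEC)
generated_source = render_source(SPEC)
```

Writing generated_source to a file gives an IDE real class statements to index, whereas the classes in runtime_classes only exist once the program is running.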