2. Technical specification

The CGIF harvester for the Culture Knowledge Graph ingests data structured with four schema.org classes. Only these classes and the properties listed below are ingested; all other triples are ignored.

In the following classes and properties, @context, @language, and @type are not explicitly listed since their notation differs across RDF implementations.

schema:DataFeed

A harvestable feed providing structured information about resources in a schema:DataCatalog. It is a subclass of schema:Dataset with a modification date, a pointer to the schema:DataCatalog it belongs to, and a number of schema:DataFeedItems. If the feed contains all items on a single page or in a single file, the more specific type schema:CompleteDataFeed may be used instead. In this case, the harvester takes all items and does not look for feed pagination.

Property                      Description
schema:name                   Name of the schema:DataFeed (like "Bildarchiv des Corpus Vitrearum Deutschland").
schema:dateModified           Modification date of the schema:DataFeed. Used to determine whether the feed needs to be re-harvested.
schema:includedInDataCatalog  The schema:DataCatalog that the feed is part of.
schema:dataFeedElement        The schema:DataFeedItems contained in the feed.
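
The role of schema:dateModified can be illustrated with a small sketch. It assumes that the harvester remembers when it last harvested a feed and that both timestamps are ISO 8601 strings with a UTC offset; the function name and the sample dates are hypothetical and not part of CGIF.

```python
from datetime import datetime


def needs_reharvest(feed: dict, last_harvested: str) -> bool:
    """Return True if the feed was modified after the last harvest.

    Assumption: both values are ISO 8601 timestamps with an offset,
    so they can be compared as timezone-aware datetimes.
    """
    modified = datetime.fromisoformat(feed["schema:dateModified"])
    harvested = datetime.fromisoformat(last_harvested)
    return modified > harvested


# Hypothetical feed header; only the fields needed for the check are shown.
feed = {
    "@type": ["schema:DataFeed", "schema:Dataset"],
    "schema:name": "Bildarchiv des Corpus Vitrearum Deutschland",
    "schema:dateModified": "2024-05-01T12:00:00+00:00",
}

print(needs_reharvest(feed, "2024-04-01T00:00:00+00:00"))
```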

Formally, schema:DataFeed is a subclass of schema:Dataset. To make this explicit for validators, however, schema:Dataset should be listed as a second @type in addition to schema:DataFeed.
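
As an illustration, a feed header with both types could be assembled as a JSON-LD document like this. This is a minimal sketch: the prefix mapping in @context, the example.org IRIs, and the helper name make_feed are assumptions made for this example, not values prescribed by CGIF.

```python
import json

# Assumed prefix mapping; the notation of @context may differ in your setup.
CONTEXT = {"schema": "http://schema.org/"}


def make_feed(iri, name, modified, catalog_iri, elements):
    """Build a schema:DataFeed node that also declares schema:Dataset."""
    return {
        "@context": CONTEXT,
        "@id": iri,
        # Both types are listed so validators see the feed as a Dataset too.
        "@type": ["schema:DataFeed", "schema:Dataset"],
        "schema:name": name,
        "schema:dateModified": modified,
        "schema:includedInDataCatalog": {
            "@id": catalog_iri,
            "@type": "schema:DataCatalog",
        },
        "schema:dataFeedElement": elements,
    }


# Hypothetical IRIs on example.org.
feed_doc = make_feed(
    iri="https://example.org/feed",
    name="Bildarchiv des Corpus Vitrearum Deutschland",
    modified="2024-05-01",
    catalog_iri="https://example.org/catalog",
    elements=[],
)

print(json.dumps(feed_doc, indent=2, ensure_ascii=False))
```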

If a schema:DataFeed is not self-contained but spread across several (frontend) pages of an online database, feed pagination can be expressed using the Hydra Core Vocabulary. In this case, the schema:DataFeed gets the additional type hydra:Collection as well as the properties hydra:totalItems and hydra:view; the latter holds the pagination controls (hydra:first, hydra:last, hydra:next, hydra:previous).
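
How a consumer might walk such a paginated feed is sketched below. This is not the harvester's actual implementation: the fetch callable, the in-memory PAGES dictionary standing in for HTTP responses, and the example.org URLs are all assumptions, and schema:dataFeedElement is assumed to be a list.

```python
def _iri(value):
    """Accept either a plain IRI string or a {"@id": ...} node."""
    return value["@id"] if isinstance(value, dict) else value


def _types(node):
    """Return the @type of a node as a list."""
    types = node.get("@type", [])
    return [types] if isinstance(types, str) else list(types)


def iter_feed_elements(start_url, fetch):
    """Yield all feed elements, following hydra:next until it runs out."""
    url = start_url
    seen = set()  # guards against pagination loops
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield from page.get("schema:dataFeedElement", [])
        if "schema:CompleteDataFeed" in _types(page):
            break  # a complete feed has no further pages
        view = page.get("hydra:view", {})
        url = _iri(view["hydra:next"]) if "hydra:next" in view else None


# Two hypothetical pages of one feed, keyed by URL.
P1 = "https://example.org/feed?page=1"
P2 = "https://example.org/feed?page=2"
FEED_TYPES = ["schema:DataFeed", "schema:Dataset", "hydra:Collection"]
PAGES = {
    P1: {
        "@type": FEED_TYPES,
        "hydra:totalItems": 3,
        "hydra:view": {"hydra:first": P1, "hydra:next": P2, "hydra:last": P2},
        "schema:dataFeedElement": [{"@id": "_:e1"}, {"@id": "_:e2"}],
    },
    P2: {
        "@type": FEED_TYPES,
        "hydra:totalItems": 3,
        "hydra:view": {"hydra:first": P1, "hydra:previous": P1, "hydra:last": P2},
        "schema:dataFeedElement": [{"@id": "_:e3"}],
    },
}

elements = list(iter_feed_elements(P1, PAGES.__getitem__))
print(len(elements))
```

Passing the fetch function in as a parameter keeps the traversal logic independent of any particular HTTP client.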

schema:DataCatalog

A collection of data feeds that is mapped to nfdicore:DataPortal in the knowledge graph. A schema:DataCatalog has at least an IRI, a name, and a publisher. The publisher (usually a schema:Organization) is identified by its Culture IRI.

Property          Description
schema:name       Name of the schema:DataCatalog (like "Corpus Vitrearum Deutschland").
schema:publisher  Publisher of the schema:DataCatalog, referenced by its Culture IRI.
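
The minimum requirements for a schema:DataCatalog (IRI, name, publisher) can be checked with a few lines of code. The following sketch assumes the catalog is available as a JSON-LD dictionary with prefixed keys; the function name and all IRIs, including the placeholder standing in for a Culture IRI, are hypothetical.

```python
def missing_catalog_fields(catalog: dict) -> list:
    """Return the required fields that a DataCatalog node is lacking."""
    missing = []
    if not catalog.get("@id"):
        missing.append("@id")
    if not catalog.get("schema:name"):
        missing.append("schema:name")
    publisher = catalog.get("schema:publisher")
    # The publisher must be a reference carrying an IRI.
    if not (isinstance(publisher, dict) and publisher.get("@id")):
        missing.append("schema:publisher")
    return missing


catalog = {
    "@id": "https://example.org/catalog",
    "@type": "schema:DataCatalog",
    "schema:name": "Corpus Vitrearum Deutschland",
    "schema:publisher": {
        # Placeholder only; a real feed would use the publisher's Culture IRI.
        "@id": "https://example.org/publisher",
        "@type": "schema:Organization",
    },
}

print(missing_catalog_fields(catalog))
```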

schema:DataFeedItem

Wrapper for a single item within a larger schema:DataFeed. When data is provided per resource instead of as a list, schema:DataFeedItem is left out and the wrapped item is provided directly.

Property     Description
schema:item  The item wrapped by the schema:DataFeedItem. Its @type is a schema.org class, i.e. some subclass of schema:Thing (see below).

In individual resource views, there is no need to add schema:DataFeedItem as a wrapper.
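
A consumer therefore has to handle two shapes: a feed whose items are wrapped, and a single resource provided directly. One way to normalise both is sketched here, assuming JSON-LD dictionaries with prefixed keys and schema:dataFeedElement given as a list; the function name and sample data are hypothetical.

```python
def extract_items(document: dict) -> list:
    """Return the described items, whether wrapped in a feed or not."""
    types = document.get("@type", [])
    types = [types] if isinstance(types, str) else list(types)
    if "schema:DataFeed" in types or "schema:CompleteDataFeed" in types:
        # Feed: unwrap each schema:DataFeedItem.
        return [
            element["schema:item"]
            for element in document.get("schema:dataFeedElement", [])
        ]
    # Individual resource view: the document is the item itself.
    return [document]


image = {
    "@id": "https://example.org/item/1",
    "@type": "schema:ImageObject",
    "schema:name": "Example image record",
}
feed = {
    "@type": ["schema:DataFeed", "schema:Dataset"],
    "schema:dataFeedElement": [
        {"@type": "schema:DataFeedItem", "schema:item": image},
    ],
}

print(extract_items(feed) == extract_items(image))
```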

Item wrapped by schema:DataFeedItem

The item represents a resource for the knowledge graph with an IRI, a name, and a @type from the list of schema.org classes. This enables you to categorise the type of resource you are providing, like a schema:ImageObject for an image record or, most likely, another subclass of schema:CreativeWork. Optionally, each item can have a schema:temporalCoverage property (date range with ISO dates) and a number of terms from controlled vocabularies and/or authority files.

Property                 Description
schema:name              Name of the item.
schema:license           URL of the item's license.
schema:temporalCoverage  The period that the item applies to, in ISO 8601 time interval format. Depending on your data, a string literal may be provided in addition.
schema:keywords          Describes the content of the item with schema:DefinedTerms from controlled vocabularies and/or authority files. Each schema:DefinedTerm consists of an IRI and a reference to the vocabulary that defines it.
schema:image             URL of an image that represents the item or, for image resources, is the item itself.
schema:lyrics            Only valid for a schema:MusicComposition. Blank node with the @type schema:CreativeWork and the property schema:text containing the composition's lyrics.
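
Because schema:temporalCoverage may hold either an ISO 8601 interval or a free string literal, a consumer needs to tell the two apart. The sketch below handles only the simplest interval form, two full calendar dates separated by a slash; other ISO 8601 variants (durations, open intervals, reduced precision) are treated as unparsed here. The function name and sample values are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def parse_temporal_coverage(value: str) -> Optional[Tuple[date, date]]:
    """Parse "YYYY-MM-DD/YYYY-MM-DD"; return None for anything else."""
    start_text, separator, end_text = value.partition("/")
    if not separator:
        return None  # no interval separator: treat as a string literal
    try:
        start = date.fromisoformat(start_text)
        end = date.fromisoformat(end_text)
    except ValueError:
        return None
    # Reject intervals that end before they start.
    return (start, end) if start <= end else None


print(parse_temporal_coverage("1250-01-01/1300-12-31"))
print(parse_temporal_coverage("13th century"))
```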

schema:DefinedTerm

Defined terms from controlled vocabularies or authority files that describe the item wrapped by a schema:DataFeedItem. For each term, the IRI of the corresponding vocabulary or authority file must be provided.

Property                 Description
schema:inDefinedTermSet  URL of the vocabulary or authority file that contains the schema:DefinedTerm.

The set of vocabularies ingested into the Culture Knowledge Graph as schema:DefinedTerms is limited, both to facilitate linking and to limit data residue in the graph. The following vocabularies are currently recognised when CGIF data is ingested:

Please contact us if you would like to suggest another vocabulary that you expect many data feeds from the culture domain to use.
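
The effect of this restriction on an item's schema:keywords can be sketched as a simple filter. The vocabulary URLs below are placeholders on example.org and do not represent the actual list of recognised vocabularies; the function name is hypothetical as well.

```python
# Placeholder allow-list; NOT the real set of recognised vocabularies.
ALLOWED_TERM_SETS = {"https://example.org/vocab/a/"}


def accepted_terms(item: dict, allowed: set) -> list:
    """Keep only DefinedTerms that have an IRI and an allowed term set."""
    kept = []
    for term in item.get("schema:keywords", []):
        term_set = term.get("schema:inDefinedTermSet")
        # The term set may be a plain URL or a {"@id": ...} reference.
        if isinstance(term_set, dict):
            term_set = term_set.get("@id")
        if term.get("@id") and term_set in allowed:
            kept.append(term)
    return kept


item = {
    "@id": "https://example.org/item/1",
    "@type": "schema:ImageObject",
    "schema:keywords": [
        {
            "@id": "https://example.org/vocab/a/term-1",
            "@type": "schema:DefinedTerm",
            "schema:inDefinedTermSet": "https://example.org/vocab/a/",
        },
        {
            "@id": "https://example.org/vocab/b/term-9",
            "@type": "schema:DefinedTerm",
            "schema:inDefinedTermSet": "https://example.org/vocab/b/",
        },
    ],
}

for term in accepted_terms(item, ALLOWED_TERM_SETS):
    print(term["@id"])
```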