2.7. Extending the iArt project

Extending the iArt project to meet the needs of NFDI4Culture tool makers

Project proposal context

iART is an open web platform for visual media research that facilitates the process of comparative vision. The system integrates various machine learning techniques for keyword- and content-based image retrieval as well as category formation via clustering. The iART platform is designed so that it can easily be extended with plug-ins to meet the various requirements of research in art history and the digital humanities. This modular architecture, combined with its state-of-the-art machine learning approach, makes iArt a perfect partner project for much of the work dedicated to digitisation and data enrichment in Task Area 1 of NFDI4Culture. Thanks to the integration of a Wikibase instance with Kompakkt, annotation of various 2D and 3D media within a semantic environment is already possible with the MVP toolchain developed in TA1’s Measure 4. Combining the work already done in the iArt project with the MVP toolchain via dedicated plug-ins would open up entirely new possibilities for semi-automated annotation, enhanced by image recognition and an AI-driven recommendation system.

Crucial to linking the two existing projects is Antelope, the new Annotation and Terminology service developed at TIB’s Open Science Lab in the context of Task Areas 1 and 5 of NFDI4Culture (see also GitLab repository). This service is designed to meet the need for a comprehensive terminology service within the cultural field and to connect that service directly to a collection management system. The Antelope service will provide an API that makes a range of semantic terminologies available directly within the interfaces of Kompakkt and Wikibase.

Deliverables

1) Data import customizations in iArt

The iArt project was extended with a new stateless indexer component that operates independently of the existing iArt workflows and therefore does not interfere with the backend services or the database model. The new indexer method is available via API calls over gRPC.

2) Adaptation of the existing iArt system

The iArt plugin system was completely refactored to allow a more flexible setup of plugin modules (e.g. different AI models). The image classification system was updated to use a generalized OpenAI CLIPText model instead of use-case-specific neural networks. The image classification method was extended to accept images uploaded with the API call. This makes the classification process more flexible, as images no longer need to be persisted in the iArt database in order to be classified. As the CLIPText model can deliver embeddings (positions in a vector space) for images as well as for texts, both functionalities were published via the API to give client processes more options.
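To illustrate what a flexible setup of plugin modules can look like, here is a minimal sketch of a name-based plugin registry. It is not the iArt implementation: the names `EmbeddingPlugin`, `register`, `load_plugin` and the toy `ByteHistogramPlugin` are hypothetical, and the histogram merely stands in for a real model such as CLIP.

```python
from typing import Dict, List, Type


class EmbeddingPlugin:
    """Hypothetical plugin interface: turn raw input bytes into a vector."""

    def embed(self, data: bytes) -> List[float]:
        raise NotImplementedError


# Hypothetical registry: maps a configured plugin name to its class.
_PLUGINS: Dict[str, Type[EmbeddingPlugin]] = {}


def register(name: str):
    """Class decorator that adds a plugin class to the registry under `name`."""
    def decorator(cls: Type[EmbeddingPlugin]) -> Type[EmbeddingPlugin]:
        _PLUGINS[name] = cls
        return cls
    return decorator


@register("byte-histogram")
class ByteHistogramPlugin(EmbeddingPlugin):
    """Toy stand-in for a real model: a normalized 4-bin histogram of byte values."""

    def embed(self, data: bytes) -> List[float]:
        bins = [0.0] * 4
        for b in data:
            bins[b // 64] += 1.0  # byte values 0..255 fall into bins 0..3
        total = sum(bins) or 1.0  # avoid division by zero for empty input
        return [x / total for x in bins]


def load_plugin(name: str) -> EmbeddingPlugin:
    """Instantiate a plugin by its configured name."""
    if name not in _PLUGINS:
        raise KeyError(f"unknown plugin: {name}")
    return _PLUGINS[name]()
```

With such a registry, swapping one model for another becomes a configuration change (a different name passed to `load_plugin`) rather than a code change in the calling service.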

3) Extension with new functions

The newly implemented classification function was extended with several new request parameters that control the behaviour of the method. This includes the use of individual dictionaries. Instead of receiving the entities with the highest classification scores from the CLIPText vector space, users can now supply their own controlled vocabulary for image classification. This supports domain-specific use cases such as classification using the Iconclass or Getty vocabularies. The new iArt API methods were integrated into the Antelope service. This allows our users to access the iArt functionality both via the API and via the web frontend, and makes it easier to integrate it into existing research software along with other required functions such as text-based entity linking and terminology search. The option to provide additional text input describing the image was added to adapt the service to the requirements of use cases such as catalogue annotation and other workflows that rely on existing metadata (e.g. for museums or art collections). The iArt functionality via Antelope is available at: https://service.tib.eu/annotation/
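The idea of classifying against a user-supplied vocabulary can be sketched as ranking vocabulary terms by the cosine similarity between their text embeddings and the image embedding. The following is an illustration under assumptions, not the Antelope or iArt code: the function names are hypothetical, and the three-dimensional vectors are made-up placeholders for embeddings that a model such as CLIP would produce.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(
    image_vec: List[float],
    vocabulary: Dict[str, List[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank the terms of a controlled vocabulary by similarity to the image."""
    scores = [(term, cosine(image_vec, vec)) for term, vec in vocabulary.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Made-up toy embeddings; real ones would come from the text encoder.
VOCABULARY = {
    "portrait": [1.0, 0.0, 0.0],
    "landscape": [0.0, 1.0, 0.0],
    "still life": [0.0, 0.0, 1.0],
}

# Made-up image embedding that lies closest to "portrait".
IMAGE = [0.9, 0.1, 0.0]

if __name__ == "__main__":
    for term, score in classify(IMAGE, VOCABULARY, top_k=2):
        print(f"{term}: {score:.3f}")
```

Because only the vocabulary embeddings change between use cases, the same image embedding can be scored against Iconclass, Getty, or any other term list without retraining a model.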

4) Documentation

All Antelope APIs follow a “specification first” approach using the OpenAPI specification framework. This ensures that the documentation is always up to date. The Antelope API specs are available online at https://service.tib.eu/annotation/v3/api-docs/openapi. We are planning to extend the documentation by publishing the API docs in an interactive Swagger UI.
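One practical benefit of a published OpenAPI document is that clients can inspect the available operations programmatically. The sketch below lists the operations found in an OpenAPI document; the embedded sample and its paths (`/example/classify`, `/example/status`) are invented for illustration and do not describe the actual Antelope endpoints. In practice, the JSON would be downloaded from the URL given above.

```python
import json
from typing import List, Tuple

# HTTP methods that OpenAPI allows as operation keys inside a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def list_operations(spec: dict) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI document."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for key, value in item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path, value.get("summary", "")))
    return operations


# Invented sample document; not the real Antelope specification.
SAMPLE_SPEC = json.loads("""
{
  "openapi": "3.0.1",
  "info": {"title": "Example API", "version": "0.0.0"},
  "paths": {
    "/example/status": {
      "get": {"responses": {"200": {"description": "OK"}}}
    },
    "/example/classify": {
      "parameters": [],
      "post": {
        "summary": "Classify an image",
        "responses": {"200": {"description": "OK"}}
      }
    }
  }
}
""")

if __name__ == "__main__":
    for method, path, summary in list_operations(SAMPLE_SPEC):
        print(method, path, summary)
```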

Kolja Bailly