
INTEROPERABILITY - Handling IIIF and Linked Data in the Handschriftenportal

Formal Metadata

Title
INTEROPERABILITY - Handling IIIF and Linked Data in the Handschriftenportal
Title of Series
Number of Parts
14
Author
Contributors
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The new manuscript portal for Germany - Handschriftenportal - will become the central information platform for medieval and early modern manuscripts in German collections, providing descriptive information on manuscripts as well as digital facsimiles. In a decentralized approach, digital images from various institutions will be integrated using the International Image Interoperability Framework (IIIF). Both images and texts will later be the target of user-generated annotations. This poses, and will continue to pose, various challenges for the Handschriftenportal, such as: publishing reliable URIs for digitized manuscripts with truly persistent content in the institutions’ backends; linking portal data with external authority data; linking annotations with author information; handling born-digital texts in the context of the IIIF image viewer Mirador; enhancing annotations into citable and persistent micro-publications. There should also be ways to feed information on user-generated content back to the institutions holding the original manuscripts. The talk will report on the project's current implementations and future plans; by the time SWIB 2021 takes place, the Handschriftenportal will only recently have gone live. The Handschriftenportal is a DFG-funded joint project of the State Libraries in Berlin and Munich, the Herzog August Library Wolfenbüttel and Leipzig University Library.
Transcript: English(auto-generated)
In recent years, we have had several interesting talks from Leipzig University, and the IT department of the library is known for frequently doing interesting projects. One of these is their involvement in IIIF development, and now
Annika Schröer will show us the recent developments and what is going on there. For the audience: please prepare your questions during the talk and feel free to put comments in the chat, so that after the talk we can go directly into the discussion. Okay.
Okay, thank you. So after this big picture that Patrick drew, I am planning to zoom into our own project and to talk about how we are handling IIIF and linked data to become a little more interoperable.
So at the very beginning, I'd like to give you an idea of who stands behind this project named Handschriftenportal. We are a collaborative project between four partner institutions in Berlin, Leipzig, Munich, and Wolfenbüttel. We have regular consultations with the German manuscript centers, four of which are at the
partner institutions, so they are involved. And we also have an academic advisory board. The project itself is funded by the German Research Foundation, and we are currently in the first project phase, from 2018 to the end of this year.
Due to COVID, we got an extension until February 2022. But at the moment, we are in the application process for yet another grant for a second project phase that will hopefully start in 2022, so next year.
Our goal is to become the one central information platform for medieval and early modern manuscripts in German collections. So the Handschriftenportal will supersede the current portal, Manuscripta Mediaevalia. We're handling different kinds of content. Mainly, there is descriptive information on manuscripts; I'll come to that in a minute.
And digital facsimiles, of course, of the original manuscripts. So what we want is for the user to be able to discover, view, compare, and annotate manuscripts and the data concerning them, so to do research within the portal itself.
All that we are doing is based on interoperable exchange formats, not only as export formats: our backends are working with them. We're using TEI XML for the descriptive information on manuscripts and, of course, IIIF for the digital facsimiles. So this is a glimpse at our landing page, where you can see our search and workspace areas and where you can easily start a first search.
The result list then is, like the rest of the portal, manuscript- or object-centric, meaning that we, of course, index and
search everything that we have in the portal, all content, but we roll it up to manuscript level to show it to the user. So we have this manuscript or object, and we have the descriptive text linked with it, and digital facsimiles, basic metadata and so on.
All of these have their own URIs. And those persistent URIs for each manuscript resolve to an HTML representation of the basic metadata, as well as all the other content that is available in the portal. So mainly our descriptions and the digital images that can be added to our workspace over here.
Yes, I said that I would say a little more on our academic manuscript descriptions, because they are simply not comparable to the usual bibliographic metadata that we are used to in libraries. What we have here are full texts.
They are a product of an academic process. They are very detailed discussions of the origin, physical appearance, language, text, illustrations, so everything you can think about when you have a manuscript. As they are a published resource, they can themselves become the focus of further academic discussion.
Yes, that's a part of the TEI XML of a manuscript description. So you have the authors here, for example, and you have rights statements and license statements.
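As an illustration of the kind of header information just mentioned, here is a minimal sketch of reading authors and a licence statement out of a TEI document with Python's standard library. The sample XML is invented for this example and is not taken from the portal; the element choices (`titleStmt/author`, `availability/licence`) are an assumption about where such statements might live, and the function name `read_header` is hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document (assumption); real portal descriptions are far richer.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Description of an example manuscript</title>
        <author>Example Author</author>
      </titleStmt>
      <publicationStmt>
        <availability>
          <licence target="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</licence>
        </availability>
      </publicationStmt>
      <sourceDesc><p>Example only.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def read_header(tei_xml: str) -> dict:
    """Collect author names and the licence URI from a TEI header."""
    root = ET.fromstring(tei_xml)
    authors = [
        a.text.strip()
        for a in root.findall(".//tei:titleStmt/tei:author", TEI_NS)
        if a.text
    ]
    licence = root.find(".//tei:availability/tei:licence", TEI_NS)
    return {
        "authors": authors,
        "licence": licence.get("target") if licence is not None else None,
    }


header = read_header(SAMPLE_TEI)
print(header)
```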
And you can see that we are currently migrating data from Manuscripta Mediaevalia to TEI to use it in the Handschriftenportal, of course. The other part is images, of course, where we have completely decentralized storage via IIIF.
So there's not a single image that is hosted by the Handschriftenportal itself; we only register the IIIF manifest links and integrate them. So everything is stored at the original repositories. You can use our filters to filter down to manuscripts with IIIF resources that you can open in our workspace here.
So at the moment there are very many IIIF manifests. Well, what is IIIF? The acronym itself, it stands for International Image Interoperability Framework.
I can really recommend going to this page, for example, to get a few introductory words and to learn how it really works. I can't do it in this time, unfortunately. There will be a workshop tomorrow by Leander Seige and Carsten Heck about IIIF and the wild.
That will also be very interesting, I hope, and I will only give you a very brief idea of what IIIF is about. So this, I think, I can call the traditional way of presenting images and metadata.
So every institution has their own stores and their own presentation and website, which normally works pretty well, but doesn't have any interoperability, of course. And that's where IIIF comes into play. What it really does is basically define APIs for handling images and metadata.
And they are RESTful APIs using JSON-LD. And if every store here just supports IIIF and those APIs, then, on the other hand, every presenting website can not only present the content of their own stores, but, without any further effort, present everything from the other stores as well.
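To make the "RESTful APIs" point concrete, here is a small sketch of how an IIIF Image API request URL is put together from its path parameters (identifier, region, size, rotation, quality, format). The server address and the identifier are made up for the example, and the helper names `image_url` and `info_url` are hypothetical.

```python
from urllib.parse import quote


def image_url(base, identifier, region="full", size="max",
              rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier must be percent-encoded, including any slashes.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def info_url(base, identifier):
    """URL of the image information document (info.json) for an image."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical image server and identifier (assumptions for illustration).
BASE = "https://iiif.example.org/image"
full_page = image_url(BASE, "ms 1/f1r")
detail = image_url(BASE, "ms 1/f1r", region="100,200,300,150")
print(full_page)
print(detail)
print(info_url(BASE, "ms 1/f1r"))
```

Because every compliant image server answers the same URL pattern, a viewer needs no repository-specific code to request a full page or a cropped detail.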
So that's just the core concept of IIIF. That's a glimpse into a manifest, what can maybe be called the core of the presentation API.
So every manifest has its own, hopefully persistent, URI that is the anchor for integrating it into any other service. It has very little metadata here, only key-value metadata for display purposes.
And then it has those sequences. The terms are a little metaphorical: we have canvases, every page is a canvas, and every image is a painting annotation that is painted on the canvas. So the manifest in the end is just a collection of links.
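The structure described here (manifest, sequences, canvases, images painted onto canvases) can be sketched as follows, assuming a IIIF Presentation API 2.x manifest, since that is the version that uses sequences. The manifest below is a made-up, heavily shortened example with hypothetical URIs, and `list_pages` is a hypothetical helper.

```python
# Shortened, invented Presentation API 2.x manifest (all URIs are assumptions).
MANIFEST = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/ms-1/manifest",
    "@type": "sc:Manifest",
    "label": "Example manuscript",
    "metadata": [{"label": "Shelfmark", "value": "Ms 1"}],
    "sequences": [{
        "@type": "sc:Sequence",
        "canvases": [{
            "@id": "https://example.org/iiif/ms-1/canvas/f1r",
            "@type": "sc:Canvas",
            "label": "1r",
            "height": 3000,
            "width": 2000,
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": "https://example.org/iiif/ms-1/canvas/f1r",
                "resource": {
                    "@id": "https://iiif.example.org/image/ms-1-f1r/full/max/0/default.jpg",
                    "@type": "dctypes:Image",
                },
            }],
        }],
    }],
}


def list_pages(manifest):
    """Return (canvas URI, label, [image URIs]) for every canvas."""
    pages = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            # Only annotations with the painting motivation put an image on the canvas.
            images = [
                anno["resource"]["@id"]
                for anno in canvas.get("images", [])
                if anno.get("motivation") == "sc:painting"
            ]
            pages.append((canvas["@id"], canvas.get("label"), images))
    return pages


pages = list_pages(MANIFEST)
for canvas_uri, label, images in pages:
    print(label, canvas_uri, images)
```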
Yes, why do we use IIIF? As I said before, we want people to work collaboratively on digital images and manuscripts from different collections. So our digital facsimiles come from all the participating institutions.
We want people to be able to display them and compare them, also with non-portal content. We want to make use of the workspace functionalities that come with Mirador 3. We want people to annotate images and descriptions in a reusable format. And above all, we also want people to reuse our portal content and their own products.
Technically, there is another advantage, of course: we can make use of a very wide range of open source tools and interact with a really vibrant community.
Where do we use IIIF? A very brief summary of the software architecture of our front office: it consists primarily of our search and workspace functionalities and is implemented as a React application. We have Apache Solr as a discovery service in the background and Mirador 3 as the workspace viewer.
All our digital facsimiles are integrated via IIIF. Printed catalogs were also digitized and made available via IIIF; I'll come to that in a minute. We will have a hosting service for small institutions in the second project phase, and our annotations will also be Web Annotations compatible with IIIF.
Yeah, I just said it: printed catalogs. So we have, on the one hand, those born-digital manuscript descriptions that I showed before.
But we also came with a few hundred bound book catalogs that were scanned and OCRed, and for which we created, or are in the process of creating, IIIF manifests.
They are of course published content so we can't have CC licenses here. But still this is IIIF and is reusable in a way and can be used in our portal to see those catalogs side by side with all the other content. And to have all the features IIIF offers including text overlay features and an index and so on.
Yes, the catalogs, as well as all the other content, will be annotatable. This is a preview, so to say, in the Handschriftenportal, just a proof of concept for integrating Mirador annotations, and they will follow in the second project phase.
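For readers unfamiliar with the annotation format, here is a minimal sketch of what a W3C Web Annotation targeting a region of a IIIF canvas could look like. The portal's actual annotation model is not shown in the talk, so the URIs, the creator entry and the helper `make_annotation` are assumptions for illustration only.

```python
import json


def make_annotation(anno_id, canvas_uri, xywh, text, creator_uri):
    """Build a W3C Web Annotation commenting on a rectangular canvas region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        # Linking the annotation to its author (hypothetical URI).
        "creator": {"id": creator_uri, "type": "Person"},
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        # The target is the canvas URI plus a media fragment selecting the region.
        "target": f"{canvas_uri}#xywh={x},{y},{w},{h}",
    }


annotation = make_annotation(
    anno_id="https://example.org/annotations/1",
    canvas_uri="https://example.org/iiif/ms-1/canvas/f1r",
    xywh=(100, 200, 300, 150),
    text="Initial with gold leaf.",
    creator_uri="https://example.org/users/example-author",
)
print(json.dumps(annotation, indent=2))
```

Note that the target is the canvas, not the image file, which is why the persistence of canvas URIs matters for citable annotations.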
Yes, some other interesting features that I'd like to mention to this audience here: first, our TEI XML manuscript descriptions.
They are now integrated through a Mirador 3 plugin with client-side HTML rendering. Those of you who attended ELAG or the IIIF conference in the summer were part of our decision-making process there, and we have it implemented now.
Then there is the authority data service, which is mainly implemented by the Berlin State Library, where we are using GND data at the moment, currently for linked institutions and places and offices. Our planned enhancements there are, of course, to integrate more data from different sources and to integrate this directly into the search and presentation functionalities.
Yes, to put it in a nutshell, what IIIF and linked data are doing for us here: we have the advantage of being able to compare descriptions and images side by side in a flexible workspace, alongside external content from repositories all over the world.
But still there are some challenges left of course as always. So currently we are asking ourselves how we can inform holding institutions about new annotations on their content and possibly even feed them back to the original repositories.
So maybe this is a possible use case for Linked Data Notifications, which we just heard about, or for the IIIF Discovery API's activity streams. And another question that is even more important to us, and maybe a little more SWIB-ish, is how
we can make sure that those annotated images from external sources will stay persistent at manifest level as well as at canvas level, which seems to be a little under the radar of institutions at the moment. For us this is very important, especially in the context of citable and persistent annotations, because
the targets of the annotations are the canvases, and they need to stay persistent, of course. So can we really rely on the principles of linked data here? Do we really have cool URIs that don't change? It will stay interesting, so please stay tuned, and thank you for listening.
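The persistence concern raised at the end can be illustrated with a small check: given a manifest and a set of stored annotations, find the annotations whose target canvas no longer appears in the manifest. This is only a sketch under assumptions, not something the project describes; the data is invented, the helper names are hypothetical, and only simple target shapes (a URI string, or an object with `source` or `id`) are handled.

```python
from urllib.parse import urldefrag


def canvas_ids(manifest):
    """Collect canvas URIs from a Presentation 2.x or 3.0 manifest."""
    ids = set()
    # Presentation 2.x: canvases sit inside sequences and use "@id".
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            ids.add(canvas["@id"])
    # Presentation 3.0: canvases are listed directly under "items" and use "id".
    for item in manifest.get("items", []):
        if item.get("type") == "Canvas":
            ids.add(item["id"])
    return ids


def target_canvas(annotation):
    """Return the canvas URI an annotation points at, without its fragment."""
    target = annotation["target"]
    if isinstance(target, dict):
        target = target.get("source", target.get("id"))
        if isinstance(target, dict):
            target = target.get("id")
    return urldefrag(target).url


def dangling_annotations(manifest, annotations):
    """IDs of annotations whose target canvas is missing from the manifest."""
    known = canvas_ids(manifest)
    return [a["id"] for a in annotations if target_canvas(a) not in known]


# Invented data: the manifest was republished and canvas "f2r" was renamed.
manifest = {"sequences": [{"canvases": [
    {"@id": "https://example.org/iiif/ms-1/canvas/f1r"},
    {"@id": "https://example.org/iiif/ms-1/canvas/page-2"},
]}]}
annotations = [
    {"id": "anno-1", "target": "https://example.org/iiif/ms-1/canvas/f1r#xywh=0,0,10,10"},
    {"id": "anno-2", "target": {"source": "https://example.org/iiif/ms-1/canvas/f2r"}},
]
broken = dangling_annotations(manifest, annotations)
print(broken)
```

A check like this can only detect that a canvas URI has disappeared; it cannot tell whether the content behind an unchanged URI is still the same, which is the harder part of the persistence question.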