HTTP-PATCH for Read-Write Linked Data

Formal Metadata

Title
HTTP-PATCH for Read-Write Linked Data
Title of Series
Number of Parts
16
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Talk by Rurik Thomas Greenall, Computas AS, Norway. Title: HTTP-PATCH for Read-Write Linked Data Abstract: It can be argued that HTTP-PATCH is essential to read-write linked data; this being the case, there seems to be no absolute definition for how this should be implemented. In this talk, I present different alternatives for HTTP-PATCH and an implementation based on practical considerations from feature-driven development of a linked-data-based library platform at Oslo public library. Grounded in the work done at Oslo public library, I show how HTTP-PATCH can be implemented and used in everyday workflows, while considering several aspects of specifications such as LD-PATCH, RDF-PATCH, particularly in light of existing efforts such as JSON-PATCH. In the description of the implementation, I pay particular attention to the practical issues of using linked data in REST architecture, the widespread use of formats that do not support hypermedia and blank nodes. The talk views the cognitive constraints imposed by the dominance of the traditional library technology stack and how these colour development of new workflows and interfaces. Further, I provide some thoughts about how specifications like the linked-data platform can be reconciled with modern development techniques that largely shun such specifications, and how we can create read-write interfaces for linked data. SWIB15 Conference, 23 – 25 November 2015, Hamburg, Germany. http://swib.org/swib15 #swib15
Transcript: English(auto-generated)
So, apart from being the kind of person that uses all of the available fonts on the slide, which I apologise for, yeah, I'm a consultant and it doesn't necessarily sit very easily
for a person with a background in academic libraries working for a consultancy. But I'm actually talking about an open-source project, so this isn't going to be a sales patter from me. OK, so what we're trying to build is a new library system, and new and
library system, I don't know, that seems a bit of an odd idea again, I tend to say the same thing about this. And if you look on the right of the image there, there are some traditional concepts and they're obscured behind something called an API, and I'm not
actually going to talk about any of that stuff, I'm not going to talk about the problems that are actually solved already in libraries, and I think these are largely solved problems. I'm not going to talk about things to do with patrons either, I'm just going to talk
about this little bit of the stack here to do with cataloging, the linked data API, and the linked data that we have at the bottom of our system. And the thing that all of these have in common is, there we go, HTTP PATCH. And HTTP PATCH runs through the entirety
of our stack, and yeah, I think it's quite interesting. Can I have a quick show of hands who actually knows what HTTP PATCH is at all? A few hands, not many. And I should
admit that when I submitted the abstract for this talk, I became apprehensive about presenting on this topic because it's quite technical. I got some feedback from the program committee which made me think, yeah, we can work this out a bit. I'm not sure whether
or not I've done enough work, to be honest, to make this relevant if you're only really interested in metadata, but we'll see. Yeah. Okay. HTTP PATCH is an HTTP method, and you're probably familiar with some of these methods. POST and GET should be familiar
if you do any work with the web at all, PUT a bit less so, and DELETE possibly not really at all. Those give us the CRUD things that our database friends are very fond of. So with
these methods, you have all of the possibilities that you need to interact with the database, and why do you need to do something else? And I'll have a look at that in a moment.
Right. I've done all of the examples in JSON, but it's not linked data. It's just some rough stuff that I put together just as examples. So yeah, please don't ask questions about the examples, but yeah, everything's in JSON because it's so readable, and I don't
necessarily agree with that, so a bit of a lol anyway. Okay. So POST and PUT can mean similar things. If I want to create a resource, I can say PUT resource, and then name the
resource, and then I can actually create a resource in this way. If I don't know what the resource is called, I can actually just say, hey, I want to create a resource, give me, and what you get returned is some information about where that resource is then located. Now, PUT seems like a usable strategy here, so yeah, I'm going to put something
on the server. However, I don't agree with that strategy at all. Yeah, this means, if you don't speak Norwegian, I was going to say, if you don't speak German, this means "can" doesn't mean that you should, because if you're putting things onto a server and
telling it where to put it, you're actually instructing the server in a way that I think is a bit strange. A client probably shouldn't instruct a server in this way. So I'm going to say, let's not do that. So in a normal situation in our system, I'd like to be able
to ask for a resource, say I want to POST an animal. I get a response with a header saying where this is then located, so I get back resource one in this case, and then I can PUT data on that, and then I'm updating the resource. Okay. Then I think
that I'm putting data in. I'm calling the cat Mr. Tibbles, I'm getting the response,
yes, I've created Mr. Tibbles, that's fine. And then I realize, oh, hang on a minute, the cat's not called Mr. Tibbles, it's called Felix, in fact. And then I update it in this way by putting. But actually, I haven't done what I thought I was doing. What I've actually done is overwrite the resource in its entirety, because PUT only
allows you to put an entire resource. You actually replace the resource in its entirety. So now I've got just a name, Felix, with no type. So yeah, we intended to update something, but what we've done is replace it entirely. So in order
to update with PUT, you need to send lots of information. So if you have a long record, then you're going to be sending a lot of information upstream. And this is the point where I can say, I'm actually not interested in records,
I'm interested in resources because we're doing linked data. And you might think that that's splitting hairs, and it's colored by interpretation. And this is where I start my rant about how perhaps saying everything is a record, everything is a document, perhaps as well. That's also colored by an interpretation. So,
yeah. I don't necessarily need to go there, except that for the moment I'm saying, if we update a resource in this way, I want to be able to add data incrementally. So why? And I
think that PATCH actually provides you an idiomatic way of doing what you think you're doing. You're actually saying add data bit by bit. And this fits in quite well with triples. You can create a nice responsive interface without doing very strange things, you know, masking what you're actually doing, when you're actually doing
POST or PUT. You can actually say, okay, here's a triple, add a triple, here's a triple, add a triple. You can create a different kind of interface more idiomatically and more simply. It's lightweight, powerful, and it allows us to do new things in new ways.
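The difference between replacing and patching can be sketched with a toy in-memory store. This is only an illustration of the semantics described in the talk, not the speaker's code: the `/animal/1` path, the property names, and the `put` and `patch` helpers are all hypothetical.

```python
# Toy in-memory "server": resource path -> dict of properties (hypothetical).
store = {}

def put(uri, representation):
    """PUT semantics: the stored resource is replaced in its entirety."""
    store[uri] = dict(representation)

def patch(uri, changes):
    """PATCH semantics (sketch): apply a list of small add/delete changes."""
    resource = store.setdefault(uri, {})
    for op, key, value in changes:
        if op == "add":
            resource[key] = value
        elif op == "delete":
            # Only remove the property if it still holds the expected value.
            if resource.get(key) == value:
                del resource[key]
        else:
            raise ValueError(f"unsupported operation: {op}")

put("/animal/1", {"type": "cat", "name": "Mr. Tibbles"})

# Intended as an update, but PUT replaces everything: the type is lost.
put("/animal/1", {"name": "Felix"})
after_put = dict(store["/animal/1"])

# Start again and describe only the change.
put("/animal/1", {"type": "cat", "name": "Mr. Tibbles"})
patch("/animal/1", [("delete", "name", "Mr. Tibbles"), ("add", "name", "Felix")])
after_patch = dict(store["/animal/1"])
```

After the second PUT the resource holds only the name, while the patched resource keeps its type and carries the new name.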
It allows us to think about cataloging, certainly, in a different way. So how does it work? Well, PATCH is derived from the common understanding of a patch. And when I say
common, I mean computer science understanding, actually. And it's basically a description of the differences between two states of a resource. You can disagree with my terminology later, if you wish. In terms of a previous example, we describe the
change we want to make. We remove Mr. Tibbles and add Felix, and then we get a new state. So document B then has the changes we wanted. And that's easy. Now, easy is a word I've learnt that I'm not allowed to use, because I say this to my
developers quite a lot. This is easy. It's easy to think about, but actually it's not necessarily easy to do. And to implement this, there are some issues. For example, there's no way, at least no widely implemented way, of patching RDF in this way. There are, however, approaches to
patching JSON. And JSON Patch describes a set of operations that can be used specifically with JSON documents. And we're in this JSON thing again. This is a topic with
me that perhaps you'll notice. It doesn't necessarily sit well. So yeah, let's look at this. We have an operation here. We're saying we want to add some data. We have a path. We say we want to add this to the document at the path ABC. And then we have the value that we actually want to add. And then you get a value in your JSON document at ABC.
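As a rough sketch of the JSON Patch "add" operation just described: the slide itself is not in the transcript, so reading "the path ABC" as the JSON Pointer `/a/b/c` is an assumption, as are the value `"foo"` and the `apply_add` helper. The sketch handles nested objects only, not arrays or the root path, and is not a complete JSON Patch implementation.

```python
import json

def apply_add(document, path, value):
    """Apply a JSON Patch "add" to nested dicts (objects only, no arrays)."""
    # JSON Pointer: tokens are separated by "/"; "~1" encodes "/", "~0" encodes "~".
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    target = document
    for token in tokens[:-1]:
        target = target[token]  # the parent must already exist, else KeyError
    target[tokens[-1]] = value
    return document

# A patch document is a JSON array of operations.
patch_document = json.loads('[{"op": "add", "path": "/a/b/c", "value": "foo"}]')

document = {"a": {"b": {}}}
for operation in patch_document:
    if operation["op"] != "add":
        raise ValueError("only 'add' is handled in this sketch")
    apply_add(document, operation["path"], operation["value"])
```

The patched document then carries the value at `a` → `b` → `c`. Note that the path addresses a place in a document tree, which is the document-oriented view the talk goes on to contrast with triples.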
Now, JSON Patch is widely implemented for PATCH, actually. And PATCH, as we maybe know already, is actually unused, basically.
Is there anyone that has actually implemented PATCH in their system? One. Any others? No. Well, there you go. So, yeah. There's also the JSON document, and it has a few supported operations that actually
I don't need at all. I need add and delete, but we'll get back to that. And then it's using something else, JSON Pointer, to select things in the document, and the extent to which JSON Pointer is actually used, I don't really know. I'm guessing not so much. Similarly, there's an XML patch format. And there you've got the operation. You've got where
to add in the document. And then you've got the child that you're adding here. And what you're doing here is basically XML. It's an XML workflow. And, yeah, we're not doing XML either. We're doing RDF.
So, from my point of view, we have a few issues. And what I've noticed is that we use the Java platform, and if you like Java abandonware, then these kinds of patch formats are really well supported in that way.
Because basically all of the libraries were developed a few years ago, and then development just stopped for some reason. Our workflow is oriented towards triples, and the extent to which we work with documents, well, we're not really doing that. We just talk about triples, statements, and resources.
And I think that it's this focus on RDF that should actually make it so that perhaps we need to look somewhere else. And that was what we did.
We could have done some stuff to say, OK, well, we're patching RDF. We know it's RDF, but we can take some other format in and we can manipulate that. I want an API that's more direct than that. I want something that the developers can actually recognize.
And since RDF is really simple in its structure, I think that the functions, as I said before, add and delete, are enough. And then we've got JSON-LD. How many of you use JSON-LD? There's quite a lot, yeah. Raise your hand if you think it hasn't been completely painless using it? Because I think it's a bit of a mess, I'm sorry.
It's not been the perfect experience for us. We've actually realized along the way that using a lot of N-Triples would have been a lot simpler.
But because we're using JavaScript, we end up using JSON-LD. But anyway, we don't need the complexity of JSON-LD. And when we're trying to patch with JSON Patch, patching JSON-LD, it doesn't really understand it. And so we have to add more trickery somewhere else in our system. And yeah, I'll just move on, I think.
And there are some existing approaches to patching RDF that have been documented. I'll just quickly cover these.
SPARQL Update actually does patching. That is basically what SPARQL Update does in the simplest sense if you're inserting and deleting data. And it's a useful thing. You can use that perhaps lower down in the stack, but I wouldn't want to have all of the possibilities in SPARQL Update available to do simple patching in this way.
And this brings us to SPARQL Patch, which is basically addressing the problems I just mentioned by removing a few of the options, but it's still basically SPARQL Update.
And then there's Turtle Patch, which provides Turtle-like syntax for patching. And again, it's based on SPARQL Update, typically. And then there's RDF Patch, which I think provides the cleanest way of patching. You can add, you can delete, as you see here.
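The slide is not reproduced in the transcript, so here is a minimal sketch of the add/delete idea, loosely modelled on rows that start with `A` (add) or `D` (delete) followed by a triple. The row syntax, the `example.org` URIs, and the `apply_rows` helper are assumptions for illustration; this is not a parser for the actual RDF Patch format, and triples are compared as plain strings.

```python
def apply_rows(triples, rows):
    """Apply add ("A") and delete ("D") rows to a set of triple strings."""
    result = set(triples)
    for row in rows.strip().splitlines():
        # The first token is the operation, the rest is the statement.
        op, _, statement = row.strip().partition(" ")
        if op == "A":
            result.add(statement)
        elif op == "D":
            result.discard(statement)
        else:
            raise ValueError(f"unsupported row: {row!r}")
    return result

before = {'<http://example.org/animal/1> <http://example.org/name> "Mr. Tibbles" .'}

rows = '''
D <http://example.org/animal/1> <http://example.org/name> "Mr. Tibbles" .
A <http://example.org/animal/1> <http://example.org/name> "Felix" .
'''

after = apply_rows(before, rows)
```

Because each row is a whole statement, there is no path into a document to resolve: a change is simply a triple that is present or absent.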
Now, I have some opinions about all of these things, but they all share the same issues. They've not been implemented in the languages we use. And, yeah, they're perhaps looking at details that we're not interested in. So
we actually implemented something in the language that we do use, which is Java. And we are actually attempting to do something much simpler than the other patch formats. We're not solving the problems of patching arbitrary RDF. We're fixing a particular problem, which is doing that thing in our context,
which is updating linked data in a library system. So, we ended up with JSON. As I said, we would have preferred N-Triples, because that's cleaner when you're working with RDF. But we ended up with JSON, not JSON-LD, because basically it's something that the
developers were familiar with, and something that we could use with a JavaScript client quite easily. As I said, we only need add and delete, and we don't have bnodes. How many fans of bnodes do we have in the room?
For us, bnodes are basically a thing that we try to avoid at all costs, and we've not actually come upon a case where we've needed them at all. And that's in a year and a half's development, so we've been able to find simpler solutions in using other technologies.
OK. So, we followed JSON Patch's lead, but we also looked at RDF Patch, because it basically matched our needs much better. It was much simpler. We ended up with rather verbose objects, and I think that's unavoidable without unpleasantness in
interpreting JSON as anything other than basically strings, which is what you have in JSON. And I'm a strong believer in having explicit data and perhaps less smart software. And it looks like this.
A patch format can have an operation, the OP here. That can be add or delete. It has a subject, a predicate, and it has the object, and the object is the thing that isn't a string. It's an object. And let's have a look at a real-world example. So here we're adding, basically, we know where we're adding it, we've
got the subject, we know what the predicate is, we have a value, and the value actually here is specified as being a URI. And this is a little bit of a tricky thing. It was because we couldn't find a very, very simple way of distinguishing whether or not this is a URI or a plain literal.
We've got an example of a plain literal that would get interpreted as an RDF plain literal, in our case. We can specify a lang string. We can have other data types. As I said, xsd:anyURI is reserved for the URIs that we're using.
And then you can combine them as an object. You can combine as many of these as you wish, so you can actually generate a large patch document. We typically do it two and two if we're deleting and replacing, or we add individual triples, so what we're sending is very small chunks of data.
But you could send big chunks, there's no real issue. Okay. Yep, that's the recap.
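To recap the format in code: the slides are not part of the transcript, so this is a minimal sketch under stated assumptions rather than the project's actual format. The key names (`op`, `s`, `p`, `o`, `value`, `type`, `lang`), the operation name `delete`, the use of `xsd:anyURI` as the datatype that marks a URI, and the `example.org` URIs are all assumed for illustration, and the talk's implementation is in Java, not Python.

```python
# Assumed datatype marking an object value as a URI rather than a literal.
XSD_ANY_URI = "http://www.w3.org/2001/XMLSchema#anyURI"

def to_triple(entry):
    """Turn one patch entry into a hashable (subject, predicate, object) tuple."""
    o = entry["o"]
    return (entry["s"], entry["p"], (o["value"], o.get("type"), o.get("lang")))

def apply_patch(graph, patch):
    """Return a new set of triples with every add/delete entry applied in order."""
    result = set(graph)
    for entry in patch:
        if entry["op"] == "add":
            result.add(to_triple(entry))
        elif entry["op"] == "delete":
            result.discard(to_triple(entry))
        else:
            raise ValueError(f"unsupported operation: {entry['op']}")
    return result

animal = "http://example.org/animal/1"
name = "http://example.org/name"
graph = {(animal, name, ("Mr. Tibbles", None, None))}

patch = [
    # Delete and replace, "two and two".
    {"op": "delete", "s": animal, "p": name, "o": {"value": "Mr. Tibbles"}},
    {"op": "add", "s": animal, "p": name, "o": {"value": "Felix"}},
    # An object that is a URI, marked by its type.
    {"op": "add", "s": animal, "p": "http://example.org/species",
     "o": {"value": "http://example.org/species/cat", "type": XSD_ANY_URI}},
    # A language-tagged string.
    {"op": "add", "s": animal, "p": "http://example.org/label",
     "o": {"value": "katt", "lang": "no"}},
]

patched = apply_patch(graph, patch)
```

Keeping the object as an explicit structure, rather than guessing from the string whether it is a URI or a literal, matches the preference stated in the talk for explicit data and less smart software.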
Okay, implementation. It's Java, basically. But it's worth noting that at the bottom of the stack it all ends up as a SPARQL Update query anyway. And, yeah, I think that's fair enough. I think it's an understandable way of doing it.
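How a patch might end up as SPARQL Update can be sketched as follows. This is a guess at the general shape, not the project's Java code: the patch keys (`op`, `s`, `p`, `o`, `value`, `type`, `lang`), the use of `xsd:anyURI` to mark URIs, and the `term` and `to_sparql_update` helpers are assumptions, and the escaping is deliberately minimal.

```python
# Assumed datatype marking an object value as a URI rather than a literal.
XSD_ANY_URI = "http://www.w3.org/2001/XMLSchema#anyURI"

def term(o):
    """Serialise a patch object as a SPARQL term (minimal escaping only)."""
    if o.get("type") == XSD_ANY_URI:
        return f"<{o['value']}>"
    escaped = o["value"].replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{escaped}"'
    if "lang" in o:
        return f"{literal}@{o['lang']}"
    if "type" in o:
        return f"{literal}^^<{o['type']}>"
    return literal

def to_sparql_update(patch):
    """Translate add/delete entries into INSERT DATA / DELETE DATA operations."""
    keyword = {"add": "INSERT DATA", "delete": "DELETE DATA"}
    statements = []
    for entry in patch:
        triple = f"<{entry['s']}> <{entry['p']}> {term(entry['o'])} ."
        statements.append(f"{keyword[entry['op']]} {{ {triple} }}")
    # SPARQL Update separates operations in one request with ";".
    return " ;\n".join(statements)

patch = [
    {"op": "delete", "s": "http://example.org/animal/1",
     "p": "http://example.org/name", "o": {"value": "Mr. Tibbles"}},
    {"op": "add", "s": "http://example.org/animal/1",
     "p": "http://example.org/name", "o": {"value": "Felix"}},
]

query = to_sparql_update(patch)
```

A real service would validate the incoming URIs and escape literals fully (newlines, for instance) before building a query string, or build the update through an RDF library instead of string formatting.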
We have CORS, so our client basically just interacts with the server using JSON, and everything works fine. And that's it. The things we've learned from doing this. Basically, running code is running code. If it works, do it.
We could have spent time waiting for a standard, or in fact developing a standard, but we don't need very much for what we're doing. We have quite a limited spectrum of things we need to do. And that's, I think, the thing. You should probably only implement the things you need.
I spent quite a long time implementing lots of code to make it basically look like the linked data platform. And that was a waste of time. We actually stripped out the delete code, we stripped out the put code, because these weren't in use in the end at all. The delete code ended up being put back in because we use it for cleanup, but that came much later when the cleanup use case came into our workflow.
And I can say that our cataloging interface now relies entirely upon this patch format.
So, our way forward, it builds on the knowledge we've gained, making small steps, incrementally deconstructing and reconstructing bibliographic data. And patch helps us to do this, because it actually is very deconstructed.
And we have a current sketch of our cataloging interface that says, okay, we start with a creator, and then we dig down through a work, and then finally towards publications or perhaps manifestations, if you prefer that terminology. And using patch actually has simplified the way we interact with the data to the extent
that it's allowed the user experience guy to do his thing in a completely different way. We've moved away from looking at records, and I think that this is one of the positive outcomes of using this particular technology. And it's all done, triple by triple, using patch. And I think I've come to an end there. So, yeah. Talk to me.
I'm on Twitter. I tweet about bad library jokes, this one, and libraries and socialism, mostly.
And I'm going to do the unusual thing of ending with a joke. Thank you. Okay. Thank you very much.