HTTP-PATCH for Read-Write Linked Data
Formal Metadata
Number of Parts | 16
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/47529 (DOI)
Transcript: English(auto-generated)
00:05
So, apart from being the kind of person that uses all of the available fonts on the slide, which I apologise for, yeah, I'm a consultant and it doesn't necessarily sit very easily
00:24
for a person with a background in academic libraries working for a consultancy. But I'm actually talking about an open-source project, so this isn't going to be a sales patter from me. OK, so what we're trying to build is a new library system, and new and
00:46
library system, I don't know, that seems a bit of an odd idea again, I tend to say the same thing about this. And if you look on the right of the image there, there are some traditional concepts and they're obscured behind something called an API, and I'm not
01:05
actually going to talk about any of that stuff, I'm not going to talk about the problems that are actually solved already in libraries, and I think these are largely solved problems. I'm not going to talk about things to do with patrons either, I'm just going to talk
01:22
about this little bit of the stack here to do with cataloging, linked data API, and the linked data that we have at the bottom of our system. And the thing that all of these have in common is, there we go, HTTP patch. And HTTP patch runs through the entirety
01:48
of our stack, and yeah, I think it's quite interesting. Can I have a quick show of hands who actually knows what HTTP patch is at all? Or a few hands, not many. And I should
02:06
admit that when I submitted the abstract for this talk, I became apprehensive about presenting on this topic because it's quite technical. I got some feedback from the program committee which made me think, yeah, we can work this out a bit. I'm not sure whether
02:21
or not I've done enough work, to be honest, to make this relevant if you're only really interested in metadata, but we'll see. Yeah. Okay. HTTP PATCH is an HTTP method, and you're probably familiar with some of these methods. POST, GET, they should be familiar.
02:52
If you do any work with the web at all, PUT, a bit less so, and DELETE, possibly not really at all. Those give us the CRUD things that our database friends are very fond of. So with
03:10
these methods, you have all of the possibilities that you need to interact with the database, and why do you need to do something else? And I'll have a look at that in a moment.
03:25
Right. I've done all of the examples in JSON, but it's not linked data. It's just some rough stuff that I put together just as examples. So yeah, please don't ask questions about the examples, but yeah, everything's in JSON because it's so readable, and I don't
03:45
necessarily agree with that, so a bit of a lol anyway. Okay. So POST and PUT can mean similar things. If I want to create a resource, I can say PUT resource, and then name the
04:03
resource, and then I can actually create a resource in this way. If I don't know what the resource is called, I can actually just say, hey, I want to create a resource, give me, and what you get returned is some information about where that resource is then located. Now, PUT seems like a usable strategy here, so yeah, I'm going to put something
04:27
on the server. However, I don't agree with that strategy at all. Yeah, this means, if you don't speak Norwegian, I was going to say, if you don't speak German, this means "can" doesn't mean that you should, because if you're putting things onto a server and
04:47
telling it where to put it, you're actually instructing the server in a way that I think is a bit strange. A client probably shouldn't instruct a server in this way. So I'm going to say, let's not do that. So in a normal situation in our system, I'd like to be able
05:05
to ask for a resource, say I want to POST an animal. I get a response with a header saying where this is then located, so I get back a resource one in this case, and then I can PUT data on that, and then I'm updating the resource. Okay. Then I think
05:34
that I'm putting data in. I'm calling the cat Mr. Tibbles, I'm getting the response,
05:41
yes, I've created Mr. Tibbles, that's fine. And then I realize, oh, hang on a minute, the cat's not called Mr. Tibbles, it's called Felix, in fact. And then I update it in this way by putting. But actually, I haven't done what I thought I was doing. What I've actually done is overwrite the resource in its entirety, because PUT only
06:01
allows you to put an entire resource. You actually replace the resource in its entirety. So now I've got just a name, Felix, with no type. So yeah, we intended to update something, but what we've done is replace it entirely. So in order
06:24
to update with PUT, you need to send lots of information. So if you have a long record, then you're going to be sending a lot of information upstream. And this is the point where I can say, I'm actually not interested in records,
06:41
I'm interested in resources because we're doing linked data. And you might think that that's splitting hairs, and it's colored by interpretation. And this is where I start my rant about how perhaps saying everything is a record, everything is a document, perhaps as well. That's also colored by an interpretation. So,
07:03
yeah. I don't necessarily need to go there, except for the moment I'm saying, if we update a resource in this way, I want to say add data incrementally. So why? And I
07:23
think that patch actually provides you an idiomatic way of doing what you think you're doing. You're actually saying add data bit by bit. And this fits in quite well with triples. You can create a nice responsive interface without doing very strange things, you know, masking what you're actually doing. You're actually doing
07:41
POST or PUT. You can actually say, okay, here's a triple, add a triple, here's a triple, add a triple. You can create a different kind of interface more idiomatically and more simply. It's lightweight, powerful, and it allows us to do new things in new ways.
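The Mr. Tibbles example can be sketched in a few lines of Python. This is only an illustration of the semantics, not the project's code: `ResourceStore`, the `/animal/1` location, and the `type`/`name` fields are assumptions made up for the example, and the store is an in-memory stand-in for a server rather than real HTTP.

```python
import itertools


class ResourceStore:
    """Tiny in-memory stand-in for a server (hypothetical, no real HTTP)."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def post(self, collection):
        """Create an empty resource; the server chooses where it lives."""
        location = f"/{collection}/{next(self._ids)}"
        self._resources[location] = {}
        return location  # plays the role of the Location response header

    def put(self, location, representation):
        """PUT replaces the whole resource with whatever was sent."""
        self._resources[location] = dict(representation)

    def patch(self, location, changes):
        """PATCH applies only the listed changes; other fields stay put."""
        resource = self._resources[location]
        for op, key, value in changes:
            if op == "add":
                resource[key] = value
            elif op == "delete":
                if resource.get(key) == value:
                    del resource[key]
            else:
                raise ValueError(f"unsupported op: {op}")

    def get(self, location):
        return dict(self._resources[location])


store = ResourceStore()
cat = store.post("animal")
store.put(cat, {"type": "cat", "name": "Mr. Tibbles"})

# Renaming with PUT and a partial body silently drops "type".
store.put(cat, {"name": "Felix"})
after_put = store.get(cat)

# Restore the resource, then rename it with a patch instead.
store.put(cat, {"type": "cat", "name": "Mr. Tibbles"})
store.patch(cat, [("delete", "name", "Mr. Tibbles"), ("add", "name", "Felix")])
after_patch = store.get(cat)

print(after_put)
print(after_patch)
```

After the partial PUT the resource holds only the name; after the patch it keeps its type and carries the new name.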
08:06
It allows us to think about cataloging, certainly, in a different way. So how does it work? Well, patch is derived from the common understanding of patch. And when I say
08:21
common, I mean computer science understanding, actually. And it's basically a description of the differences between two states of a resource. You can disagree with my terminology later, if you wish. In terms of a previous example, we describe the
08:42
change we want to make. We remove Mr. Tibbles and add Felix, and then we get a new state. So document B then has the changes we wanted. And that's easy. Now, easy is a word I've learnt that I'm not allowed to use, because I say this to my
09:00
developers quite a lot. This is easy. It's easy to think about, but actually it's not necessarily easy to do. And to implement this, there are some issues. For example, there's no widely implemented, at least, way of patching RDF in this way. There are, however, approaches to
09:26
patching JSON. And JSON Patch describes a set of operations that can be used specifically with JSON documents. And we're in this JSON thing again. This is a topic with
09:40
me that perhaps you'll notice. It doesn't necessarily sit well. So yeah, let's look at this. We have an operation here. We're saying we want to add some data. We have a path. We say we want to add this to the path of the document that's in the path ABC. And then we have the value that we actually want to add. And then you get a value in your JSON document at ABC.
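As a rough sketch of the add operation just described, the following Python applies one JSON Patch style `add` at the path `/a/b/c`. It is a simplification under stated assumptions: only nested JSON objects are handled (no array indexes), the starting document and the value `"foo"` are invented for the example, and `apply_add` is a hypothetical helper, not a library function.

```python
import copy


def unescape(token):
    # JSON Pointer escapes: "~1" stands for "/" and "~0" for "~".
    return token.replace("~1", "/").replace("~0", "~")


def apply_add(document, path, value):
    """Apply one JSON Patch style "add" to nested JSON objects (dicts only)."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/' in this sketch")
    tokens = [unescape(t) for t in path.split("/")[1:]]
    result = copy.deepcopy(document)  # leave the caller's document untouched
    parent = result
    # Walk down to the parent of the target; it has to exist already.
    for token in tokens[:-1]:
        if not isinstance(parent, dict) or token not in parent:
            raise KeyError(f"parent of target does not exist: {token}")
        parent = parent[token]
    if not isinstance(parent, dict):
        raise TypeError("arrays are not handled in this sketch")
    parent[tokens[-1]] = value  # adds the member, or replaces an existing one
    return result


document = {"a": {"b": {}}}
operation = {"op": "add", "path": "/a/b/c", "value": "foo"}
patched = apply_add(document, operation["path"], operation["value"])
print(patched)
```

The operation names where the value goes (`path`) and what goes there (`value`); the original document is left unchanged and a patched copy is returned.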
10:10
Now, JSON Patch is widely implemented for PATCH, actually. And PATCH, as we maybe know already, is actually unused, basically.
10:20
Is there anyone that has actually implemented patch in their system? One. Any others? No. Well, there you go. So, yeah. There's also the JSON document, and it has a few supported operations that actually
10:43
I don't need at all. I need add and delete, but we'll get back to that. And then it's using something else, JSON Pointer, to select things in the document, and the extent to which JSON Pointer is actually used, I don't really know. I'm guessing not so much. Similarly, there's an XML patch format. And there you've got the operation. You've got where
11:05
to add in the document. And then you've got the child that you're adding here. And what you're doing here is basically XML. It's an XML workflow. And, yeah, we're not doing XML either. We're doing RDF.
11:25
So, from my point of view, we have a few issues. And what I've noticed is that we use the Java platform, and if you like Java abandonware, then these kinds of patch formats are really well supported in that way.
11:40
Because basically all of the libraries were developed a few years ago, and then development just stops for some reason. Because our workflow is oriented towards triples, the extent to which we work with documents, we're not really doing that. We just talk about triples, statements, and resources.
12:06
And I think that it's this focus on RDF that should actually make it so that perhaps we need to look somewhere else. And that was what we did.
12:23
We could have done some stuff to say, OK, well, we're patching RDF. We know it's RDF, but we can take some other format in and we can manipulate that. I want an API that's more direct than that. I want something that the developers can actually recognize.
12:41
And since RDF is really simple in its structure, I think that the functions, as I said before, add and delete are enough. And then we've got JSON-LD. How many of you use JSON-LD? There's quite a lot, yeah. Raise your hand if you think it hasn't been completely painless to use? Because I think it's a bit of a mess, I'm sorry.
13:11
It's not been the perfect experience for us. We've actually realized along the way that using N-Triples would have been a lot simpler.
13:21
But because we're using JavaScript, we end up using JSON-LD. But anyway, we don't need the complexity of JSON-LD. And when we're trying to patch JSON-LD with JSON Patch, it doesn't really understand it. And so we have to add more trickery somewhere else in our system. And yeah, I'll just move on, I think.
13:49
And there are some existing approaches that have been documented to doing patching of RDF. I'll just quickly cover these.
14:01
SPARQL Update actually does patching. That is basically what SPARQL Update does in the simplest sense if you're inserting and deleting data. And it's a useful thing. You can use that perhaps lower down in the stack, but I wouldn't want to have all of the possibilities in SPARQL Update available to do simple patching in this way.
14:26
And this brings us to SPARQL Patch, which is basically addressing the problems I just mentioned, and removes a few of the options, but it's still basically SPARQL Update.
14:41
And then there's Turtle Patch, which provides Turtle-like syntax for patching. And again, it's based on SPARQL Update typically. And then there's RDF Patch, which I think provides the cleanest way of patching. You can add, you can delete, as you see here.
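To give a feel for the add and delete style, here is a small Python sketch that reads rows in the spirit of RDF Patch, where each row starts with `A` (add) or `D` (delete) followed by a triple. It covers only a simplified subset assumed for illustration: no headers, transactions, prefixes, or blank nodes, and the `example.org` IRIs are invented. `apply_rows` is a hypothetical helper, not part of any RDF library.

```python
import re

# One N-Triples-like term: an IRI in angle brackets, or a quoted literal
# with an optional language tag or datatype IRI.
TERM = r'<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?'
ROW = re.compile(rf"^([AD])\s+({TERM})\s+({TERM})\s+({TERM})\s*\.$")


def apply_rows(graph, text):
    """Apply simplified add/delete rows to a set of (s, p, o) term strings."""
    result = set(graph)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ROW.match(line)
        if match is None:
            raise ValueError(f"unsupported row: {line}")
        op, s, p, o = match.groups()
        if op == "A":
            result.add((s, p, o))
        else:
            result.discard((s, p, o))
    return result


S = "<http://example.org/animal/1>"
P = "<http://example.org/name>"
graph = {(S, P, '"Mr. Tibbles"')}

patch = """
D <http://example.org/animal/1> <http://example.org/name> "Mr. Tibbles" .
A <http://example.org/animal/1> <http://example.org/name> "Felix" .
"""

graph = apply_rows(graph, patch)
print(graph)
```

Because RDF is a set of statements, the two operations are enough here: a rename is one delete row followed by one add row.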
15:03
Now, I have some opinions about all of these things, but they all share the same issues. They've not been implemented in the languages we use. And, yeah, they're perhaps looking at details that we're not interested in. So
15:21
we actually implemented something in the language that we do use, which is Java. And we are actually attempting to do something much simpler than the other patch formats. We're not solving the problems of patching arbitrary RDF. We're fixing a particular problem, which is doing that thing in our context.
15:45
Which is updating linked data in a library system. So, we ended up with JSON. As I said, we would have preferred N-Triples, because that's cleaner when you're working with RDF. But we ended up with JSON, not JSON-LD, because basically it's something that the
16:07
developers were familiar with, and something that we could use with a JavaScript client quite easily. As I said, we only need add and delete, and we don't have bnodes. How many fans of bnodes do we have in the room?
16:25
For us, bnodes are basically a thing that we try to avoid at all costs, and we've not actually come upon a case where we've needed them at all. And that's in a year and a half's development, so we've been able to find simpler solutions in using other technologies.
16:45
OK. So, we followed JSON Patch's lead, but we also looked at RDF Patch, because it basically matched our needs much better. It was much simpler. We ended up with rather verbose objects, and I think that's unavoidable without unpleasantness in
17:05
interpreting JSON as anything other than basically strings, which is what you have in JSON. And I'm a strong believer in having explicit data and perhaps less smart software. And it looks like this.
17:20
A patch format can have an operation, the OP here. That can be add or delete. It has a subject, a predicate, and it has the object, and the object is the thing that isn't a string. It's an object. And let's have a look at a real-world example. So here we're adding, basically, we know where we're adding it, we've
17:44
got the subject, we know what the predicate is, we have a value, and the value actually here is specified as being URI. And this is a little bit of a tricky thing. It was because we couldn't find a very, very simple way of distinguishing whether or not this is a URI or a plain literal.
18:03
We've got an example of a plain literal that would get interpreted as an RDF plain literal, in our case. We can specify a lang string. We can have other data types. As I said, xsd:anyURI is reserved for the URIs that we're using.
18:22
And then you can combine them as an object. You can combine as many of these as you wish, so you can actually generate a large patch document. We typically do it two and two if we're deleting and replacing, or we add individual triples, so what we're sending is very small chunks of data.
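The shape described above (an operation, a subject, a predicate, and an object that is itself a JSON object) might look roughly like the following Python sketch. The transcript does not show the actual field names, so `op`, `s`, `p`, `o`, `value`, `type`, and `lang` are assumptions, as are the helper functions and the `example.org` URIs. The sketch builds the "two and two" delete-and-add pair and applies it to an in-memory set of triples.

```python
import json

# Datatype used to mark an object as a URI rather than a literal.
XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"


def make_object(value, datatype=None, lang=None):
    """Build the object part; no datatype and no lang means a plain literal."""
    obj = {"value": value}
    if datatype is not None:
        obj["type"] = datatype
    if lang is not None:
        obj["lang"] = lang
    return obj


def make_op(kind, subject, predicate, obj):
    if kind not in ("add", "delete"):
        raise ValueError(f"unsupported op: {kind}")
    return {"op": kind, "s": subject, "p": predicate, "o": obj}


def replace_value(subject, predicate, old, new):
    """A replacement is a delete of the old triple plus an add of the new."""
    return [
        make_op("delete", subject, predicate, old),
        make_op("add", subject, predicate, new),
    ]


def as_triple(entry):
    o = entry["o"]
    return (entry["s"], entry["p"], (o["value"], o.get("type"), o.get("lang")))


def apply_patch(graph, patch):
    result = set(graph)
    for entry in patch:
        if entry["op"] == "add":
            result.add(as_triple(entry))
        else:
            result.discard(as_triple(entry))
    return result


cat = "http://example.org/animal/1"
name = "http://example.org/name"
graph = {(cat, name, ("Mr. Tibbles", None, None))}

patch = replace_value(cat, name, make_object("Mr. Tibbles"), make_object("Felix"))
body = json.dumps(patch)  # what a client would send as the request body
graph = apply_patch(graph, json.loads(body))
print(body)
```

Keeping the object explicit makes the document verbose, but nothing has to guess whether a string is a URI or a literal.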
18:41
But you could send big chunks, there's no real issue. Okay. Yep, that's the recap.
19:00
Okay, implementation. It's Java, basically. But it's worth noting that at the bottom of the stack it all ends up as a SPARQL Update query anyway. And, yeah, I think that's fair enough. I think it's an understandable way of doing it.
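How a patch could end up as a SPARQL Update request can be sketched as below. This is Python for illustration only (the talk's implementation is Java), and it is not the project's code: the field names `op`, `s`, `p`, `o`, `value`, `type`, and `lang` are assumptions, `term` and `to_sparql_update` are hypothetical helpers, and subject and predicate URIs are not validated, which a real service would need to do.

```python
XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"


def term(obj):
    """Render the object of a patch entry as a SPARQL term."""
    value = obj["value"]
    if obj.get("type") == XSD_ANYURI:
        return f"<{value}>"  # marked as a URI, so not a literal
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    literal = f'"{escaped}"'
    if "lang" in obj:
        return f"{literal}@{obj['lang']}"
    if "type" in obj:
        return f"{literal}^^<{obj['type']}>"
    return literal  # plain literal


def to_sparql_update(patch):
    """Turn each entry into INSERT DATA or DELETE DATA, keeping the order."""
    keywords = {"add": "INSERT DATA", "delete": "DELETE DATA"}
    statements = []
    for entry in patch:
        keyword = keywords[entry["op"]]  # KeyError for any other operation
        triple = f'<{entry["s"]}> <{entry["p"]}> {term(entry["o"])} .'
        statements.append(f"{keyword} {{ {triple} }}")
    return " ;\n".join(statements)


patch = [
    {"op": "delete", "s": "http://example.org/animal/1",
     "p": "http://example.org/name", "o": {"value": "Mr. Tibbles"}},
    {"op": "add", "s": "http://example.org/animal/1",
     "p": "http://example.org/name", "o": {"value": "Felix"}},
]
query = to_sparql_update(patch)
print(query)
```

Restricting the client-facing format to add and delete means only these two SPARQL Update forms are ever generated, rather than exposing the full update language.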
19:20
We have CORS, so our client basically just reacts to the server using JSON, and everything works fine. And that's it. The things we've learned from doing this. Basically, running code is running code. If it works, do it.
19:45
We could have spent time waiting for a standard, or in fact developing a standard, but we don't need very much for what we're doing. We have quite a limited spectrum of things we needed to do. And that's, I think, the thing. You should probably only implement the things you need.
20:03
I spent quite a long time implementing lots of code to make it basically look like the Linked Data Platform. And that was a waste of time. We actually stripped out the DELETE code, we stripped out the PUT code, because these weren't in use in the end at all. The DELETE code ended up being put back in because we use it for cleanup, but that came much later when the cleanup use case came into our workflow.
20:31
And I can say that our cataloging interface now relies entirely upon this patch format.
20:42
So, our way forward, it builds on the knowledge we've gained, making small steps, incrementally deconstructing and reconstructing bibliographic data. And patch helps us to do this, because it actually is very deconstructed.
21:01
And we have a current sketch of our cataloging interface that says, okay, we start with a creator, and then we dig down through a work, and then finally towards publications or perhaps manifestations, if you prefer that terminology. And using patch actually has simplified the way we interact with the data to the extent
21:24
that it's allowed the user experience guy to do his thing in a completely different way. We've moved away from looking at records, and I think that this is one of the positive outcomes of using this particular technology. And it's all done, triple by triple, using patch. And I think I've come to an end there. So, yeah. Talk to me.
21:51
I'm on Twitter. I tweet about bad library jokes, this one, and libraries and socialism, mostly.
22:02
And I'm going to do the unusual thing of ending with a joke. Thank you. Okay. Thank you very much.