BIBFRAME Pilot
Formal Metadata
Title |
Title of Series |
Number of Parts | 15
Author |
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/47590 (DOI)
Publisher |
Release Date |
Language |
Content Metadata
Subject Area |
Genre |
Abstract |
Transcript: English (auto-generated)
00:01
Hi, I'm Nate Trail from the Library of Congress, and I'm the lead developer for BIBFRAME. Our colleague, our boss, Sally McCallum, is not able to be here today, so it'll just be Ray and I. Ray will talk a little bit about ontologies and I'll be talking about the pilot. We've had pilot number two going on for a number of months so far, so this is just a preliminary
00:24
overview of what we're finding and what we're doing. In order to get things started, we had to build a BIBFRAME database so that our catalogers could catalog against it. That consisted of all of the name-title and title authority records in our
00:43
LC NAF at id.loc.gov, as converted to BIBFRAME, plus a complete dump of our ILS bibliographic records converted to the new BIBFRAME 2 vocabulary, and then also native BIBFRAME from our BIBFRAME editor as the catalogers key into it.
01:10
The main focus of the pilot is really data entry and conversion of the data, and it's really not enough about what you do on the back end or on the
01:22
website side of it, which I think is something that we need to pivot and move toward now that we've built a database for this. The database that we have is living: it has daily feeds from both the ILS and the name authority records, and when the catalogers add new records, those are merged into the database, and
01:47
the records that they also have to key into the ILS, to maintain the real ILS going forward, are kept separate from that.
02:02
So between BIBFRAME 1 and BIBFRAME 2 we've changed a lot of things. We've overhauled the original vocabulary, so now we have BIBFRAME 2 as a vocabulary, and we also realized that our infrastructure was inadequate for the amount of linking that was going to happen between
02:22
our own data and others', as well as internally. So we have replaced all of the servers that were involved in id.loc.gov, and we have added additional machines to make it much more robust. Our triple store was 4store, which is an open source product, and it's no longer being supported,
02:43
so we've switched that out, and just recently we moved to allow HTTPS for all of our links. We've also changed all the software. Most of the code base that we had originally was in XQuery, but now for our conversion to the
03:01
BIBFRAME 2 vocabulary we're using XSL, and one of the reasons for that was so that we could embed the resulting tools in Metaproxy and YAZ. So when we make code changes, they're instantly available to any library that updates their Metaproxy and YAZ tools. The comparison service online is now updated for BIBFRAME 2, and we've added an authority conversion as well,
03:25
so you can see what a name-title authority looks like as a BIBFRAME work. When we ingest everything into the database, we do a merge for a bibliographic record to see if it can match to an existing work, and all the merge programs had to be
03:43
updated for the new vocabulary as well. The BIBFRAME catalog that we created had to have a front end, so we've got a new search and display interface for that, and we're just starting to use SPARQL to augment the display. Not for a whole lot of other stuff, but just a little bit of extra
04:02
display stuff. The editor itself was written for BIBFRAME 1, so we've modified that to handle BIBFRAME 2, and the profiles that we have are also now updated for the new vocabulary. Now a little comparison between ID and BIBFRAME as far as triples.
04:22
ID has about ten and a half million records that are names and subjects and other smaller vocabularies. They represent about 300 million unique triples, 21 million subjects, and only 768 predicates, which is interesting when you look down below at how many predicates there are for the BIBFRAME database.
04:43
I have not really delved into that to see why exactly there might be 14,000 unique predicates, but we'll see. As for the number of triples in the BIBFRAME database, there are 65 million descriptions, and that ends up being 4 billion triples, because
05:01
when you do a BIBFRAME conversion, it's very wordy in order to allow us to do this merging and matching. At some point soon I'm going to be deleting the things that were necessary for ingest but not necessarily for the ongoing BIBFRAME data.
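The "wordiness" mentioned here can be made concrete with a little arithmetic on the figures quoted in the talk. The sketch below simply divides the rounded triple counts by the rounded record counts; the variable and function names are illustrative, and the results are only as precise as the spoken numbers (roughly 29 triples per ID record versus roughly 62 per BIBFRAME description).

```python
# Rounded figures as quoted in the talk; not exact database counts.
ID_RECORDS = 10_500_000
ID_TRIPLES = 300_000_000
BF_DESCRIPTIONS = 65_000_000
BF_TRIPLES = 4_000_000_000


def triples_per_record(triples: int, records: int) -> float:
    """Average number of triples per record or description."""
    return triples / records


id_density = triples_per_record(ID_TRIPLES, ID_RECORDS)
bf_density = triples_per_record(BF_TRIPLES, BF_DESCRIPTIONS)

print(f"ID: about {id_density:.0f} triples per record")
print(f"BIBFRAME: about {bf_density:.0f} triples per description")
```

On these numbers a converted BIBFRAME description carries roughly twice as many triples as an ID record, which fits the remark that some ingest-only statements could be deleted later.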
05:21
So when we merge, we started with the base file of name-titles. There are about 1.2 million of those, and 19 million bibliographic descriptions were added to that. After the merge, about 1.2 million works had something merged onto them, not necessarily just their own instance.
05:45
But only about 530,000 were merged onto one of the name authority works, so the other 700,000 or so merged onto bibs that came before them, and I'll show you an example of one of those.
06:02
So this is a title, Runaway Mittens, and there are two different editions, apparently. If you look on the right-hand side, the one that's in bold is this particular instance, and the sibling right below that, with the hyperlink, that's the
06:22
instance that was merged onto the same work, so it's available for viewing as well. Just a little bit of the SPARQL that we started to use for that: from an instance, you don't necessarily know what its parent's title is, or what other siblings are available, what other instances
06:43
belong to the same work. So we just did a simple query that says: if you have an instance of some work, go find that work, get its BIBFRAME title, and bring it back. And you can also say: go find all of the things that are instances of that same object and get their titles.
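The query described here might look something like the sketch below. The SPARQL text is an assumption about its shape, not the query the Library of Congress actually runs; it uses the BIBFRAME 2 terms bf:instanceOf, bf:title and bf:mainTitle, and `?instance` is meant to be bound by the caller. The Python functions run the same lookup over a tiny in-memory triple list so the logic can be tried without a triple store. The `ex:` identifiers, the flattened `ex:titleText` predicate, and the second edition's title are all made up for illustration.

```python
# Assumed shape of the display query (illustrative only).
SIBLING_QUERY = """
PREFIX bf: <http://id.loc.gov/ontologies/bibframe/>
SELECT ?work ?workTitle ?sibling ?siblingTitle WHERE {
  ?instance bf:instanceOf ?work .
  ?work bf:title/bf:mainTitle ?workTitle .
  ?sibling bf:instanceOf ?work .
  ?sibling bf:title/bf:mainTitle ?siblingTitle .
  FILTER (?sibling != ?instance)
}
"""

# Hypothetical sample data; titles are flattened to one predicate for brevity.
TRIPLES = [
    ("ex:instance1", "bf:instanceOf", "ex:work1"),
    ("ex:instance2", "bf:instanceOf", "ex:work1"),
    ("ex:work1", "ex:titleText", "Runaway mittens"),
    ("ex:instance1", "ex:titleText", "Runaway mittens"),
    ("ex:instance2", "ex:titleText", "Runaway mittens (another edition)"),
]


def objects(subject, predicate):
    """All objects of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects of triples matching (?, predicate, obj)."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def parent_and_siblings(instance):
    """Return (work, titles) for the parent work plus sibling instances."""
    works = objects(instance, "bf:instanceOf")
    if not works:
        return None, []
    work = works[0]
    parent = (work, objects(work, "ex:titleText"))
    siblings = [
        (s, objects(s, "ex:titleText"))
        for s in subjects("bf:instanceOf", work)
        if s != instance
    ]
    return parent, siblings


print(parent_and_siblings("ex:instance1"))
```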
07:04
And so we have that. Some of the issues that we've encountered: we're using RDF/XML for the conversion of the bib records, but the BIBFRAME editor is a JSON editor, and so
07:23
before we start doing ingest, we have to bring everything to the same structure so that we can do ingest the same way. There's a huge number of triples involved, and we're still trying to figure out how to limit that and index only the stuff that is necessary. When we do our merges,
07:41
there are a lot of candidates for "maybe you shouldn't merge this." For example, a photograph doesn't necessarily have a title, so it says "untitled," but not all untitled works should be merged together. So we actively suppress some things from being merged, and
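The talk does not say how matching or suppression is implemented, so the following is only a minimal sketch of the idea under stated assumptions: records are grouped by a normalized title-plus-creator key, and any record whose normalized title is on a suppression list gets no key at all and is therefore never merged. The field names, the suppression list, and the sample records are hypothetical.

```python
# Hypothetical list of normalized titles that must never drive a merge.
SUPPRESSED_TITLES = {"untitled"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.casefold())
    return " ".join(cleaned.split())


def merge_key(title, creator):
    """Return a match key, or None when the record should not be merged."""
    norm_title = normalize(title)
    if not norm_title or norm_title in SUPPRESSED_TITLES:
        return None
    return (norm_title, normalize(creator))


def group_for_merge(records):
    """Split records into merge groups and records left unmerged."""
    groups = {}
    unmerged = []
    for record in records:
        key = merge_key(record["title"], record.get("creator", ""))
        if key is None:
            unmerged.append(record)
        else:
            groups.setdefault(key, []).append(record)
    return groups, unmerged


sample = [
    {"title": "Runaway mittens", "creator": "Example, Author"},
    {"title": "Runaway Mittens.", "creator": "example, author"},
    {"title": "Untitled", "creator": "Photographer, A."},
    {"title": "[Untitled]", "creator": "Photographer, A."},
]
groups, unmerged = group_for_merge(sample)
print(len(groups), len(unmerged))
```

The two "Runaway mittens" records collapse into one group, while both untitled photographs stay separate even though their titles and creators are identical.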
08:01
the conversion from MARC is still a moving target. So every time the conversion spec changes and Index Data makes an update, we have to figure out how that update flows into the database. We can easily change the conversion on the way in, but for all the prior records we have to figure out: can we update those on the fly in some way, or do we need to reload everything?
08:27
So some of the work that we can still look forward to is trying to figure out how to expose the BIBFRAME catalog. Right now it's behind the firewall, and we're still trying to figure out what it looks like and how to come to grips with it for ourselves, but we might do some kind of a bulk download.
08:44
I would really like to see something less cataloging-focused and more web-focused, so maybe a new RDF navigation interface, but we have not got a plan for that yet. We're still definitely looking at the data that's coming in, and
09:02
we also would like to ingest new CIP records and ONIX records and convert those to BIBFRAME. The editor has a lot of issues, and I'm sort of tempted to just change it to a simple HTML form and convert to BIBFRAME on the back end. We're always looking for new services at ID, so we need your input on what else you want to see from
09:24
the systems that we have. Soon we'll have a spec for holdings, so we'll start ingesting items as well. These are some links to the converters, documentation, and stuff like that. Thank you very much.
09:48
Hi, my name is Ray Denenberg. I'm also with the Library of Congress. I'm going to talk a little bit about the ontology, the BIBFRAME ontology. This is pretty much a condensed version of what I talked about yesterday in a two-hour session, for those of you who were there,
10:05
and for those of you who weren't there and who are interested in some of the points that I talk about, my understanding is that the presentation from yesterday will also be among the conference proceedings, so you can see a number of further examples of what I give today.
10:26
So I start by saying that the development of the BIBFRAME ontology was driven by a number of principles, two of which were simplicity and extensibility.
10:40
By that I mean that when a particular feature was suggested to us, and keep in mind that BIBFRAME is intended to be a core bibliographic ontology, in deciding whether to support it or not we would usually evaluate whether it was a core bibliographic function.
11:07
But if we decided not to support it, then we would go to whatever lengths necessary to try to ensure that it would be extensible, and we encouraged extension ontologies,
11:25
where the people developing those have much more expertise in developing ontologies for special collections and things like that. So I'll mention a few of these.
11:41
Let me make sure this can be seen, because this one's in black. Everybody see this, right? So there's art objects, which is being led by Columbia University. Harvard is leading the effort for extensions in both maps and moving images.
12:01
There's a performed music group that's being led by Stanford. That's the one that I have probably worked the closest with, and I'm not sure of this, but I think it's probably the furthest along among the extension ontologies, and somebody can correct me on that.
12:21
Rare materials by Cornell, and then there's bibliotek-o, which is loosely described as a BIBFRAME extension, but they're going to be speaking next, so they can characterize that better than I can, probably. So I want to first give a very quick review of the BIBFRAME model, just to
12:42
provide some context for what I'm going to talk about. So the basic BIBFRAME model begins with a work, and a work can have one or more instances. So for example the work Candide, the book, might have a print version published and an electronic version. Those would be two instances, and
13:08
every instance can have one or more items, and those are the copies of the given instance. We also define work-to-work relationships, which is an important aspect of BIBFRAME.
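The work, instance, and item layering described above can be sketched as three nested Python dataclasses. This is only a teaching model of the containment, not how BIBFRAME data is stored (BIBFRAME is an RDF vocabulary, and these would be linked resources rather than nested objects). The class fields and the copy labels are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy of an instance."""
    label: str


@dataclass
class Instance:
    """One published embodiment of a work, e.g. print or electronic."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Work:
    """The conceptual work; it can have one or more instances."""
    title: str
    instances: List[Instance] = field(default_factory=list)

    def item_count(self) -> int:
        """Total number of copies across all instances."""
        return sum(len(instance.items) for instance in self.instances)


# The Candide example from the talk, with made-up copies.
candide = Work(
    "Candide",
    [
        Instance("print version", [Item("copy 1"), Item("copy 2")]),
        Instance("electronic version", [Item("online copy")]),
    ],
)
print(len(candide.instances), candide.item_count())
```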
13:22
So for example the book Candide and the play Candide, these are two distinct works, and they are connected by the property that we've defined, bf:relatedTo, which is basically a super-property for a number of sub-properties,
13:42
sub-properties of bf:relatedTo. Now I just want to say at this point that the relatedTo property was pretty much conceived for the purpose of work-to-work relationships. Although there were some work-to-instance relationships and instance-to-item relationships, it was primarily work-to-work relationships.
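What it means for bf:relatedTo to be a super-property can be shown with a small sketch: a statement made with any sub-property also counts as a bf:relatedTo statement. The sub-property table below is an assumption for illustration (the talk does not list the sub-properties), and the `ex:` resources are hypothetical.

```python
# Assumed sub-property table; the talk does not enumerate the sub-properties.
SUBPROPERTY_OF = {
    "bf:derivativeOf": "bf:relatedTo",
    "bf:partOf": "bf:relatedTo",
}

# Hypothetical statement: the play Candide is derived from the book Candide.
ASSERTED = [
    ("ex:candide-play", "bf:derivativeOf", "ex:candide-book"),
]


def super_properties(prop):
    """Walk up the sub-property chain and return every ancestor property."""
    chain = []
    while prop in SUBPROPERTY_OF:
        prop = SUBPROPERTY_OF[prop]
        chain.append(prop)
    return chain


def related(subject):
    """Resources linked to subject by bf:relatedTo or any of its sub-properties."""
    return [
        o
        for s, p, o in ASSERTED
        if s == subject
        and (p == "bf:relatedTo" or "bf:relatedTo" in super_properties(p))
    ]


print(related("ex:candide-play"))
```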
14:07
But the extension groups, primarily the music extension group, wanted to relate works to other sorts of things, not just instances and items, and so in the interest of
14:23
extensibility we dropped all of the domains and ranges on the relatedTo property, and I'll talk more about that in a moment or two. And finally, in the BIBFRAME model we have a number of subclasses of work. So for example a book is a bf:Work,
14:44
but it's also a bf:Text, as bf:Text is a subclass of bf:Work. A painting would be a bf:StillImage, which is a subclass of bf:Work. And with that I want to talk just a bit about
15:00
the music extension and how we've related to that. So back to the subclasses of work: there are two in particular that are of interest to music, bf:Audio and bf:NotatedMusic. So let's take for example the Mozart clarinet quintet.
15:23
You could have a score or you could have a recording of it. The score would be a bf:NotatedMusic, the recording would be a bf:Audio, and these are two distinct works.
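The subclass relationships mentioned so far (Text, StillImage, Audio, and NotatedMusic under Work) can be sketched as a simple lookup table with a function that walks up the hierarchy. This is a minimal illustration of subclass reasoning, not a substitute for an RDF reasoner, and it covers only the classes named in the talk.

```python
# Subclasses of bf:Work named in the talk.
SUBCLASS_OF = {
    "bf:Text": "bf:Work",
    "bf:StillImage": "bf:Work",
    "bf:Audio": "bf:Work",
    "bf:NotatedMusic": "bf:Work",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# A score and a recording are both works, but neither is a kind of the other.
print(is_a("bf:NotatedMusic", "bf:Work"))
print(is_a("bf:Audio", "bf:NotatedMusic"))
```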
15:43
But the music extension adds a layer of abstraction. I've got a few typos in this, and I'm going to correct those and issue a revised version. It's not a "later" of abstraction, it's a layer of abstraction that music adds to the basic model.
16:03
You won't find this published anywhere, but this is my understanding from reading what they're working on: they're coming up with a work model, and in that work model they would actually define an abstract work. So in other words this particular piece of music, the Mozart clarinet quintet,
16:21
would have an abstract work, which is actually the music as it existed in Mozart's head. Then when it was committed to paper it would become, in BIBFRAME terms, a bf:NotatedMusic, and that's a work. But the actual abstract work would be a layer above that.
16:44
So this is sort of an extension to the BIBFRAME model that the music extension is developing. BIBFRAME doesn't define an abstract work of that type, but as far as BIBFRAME is concerned, it's perfectly compatible with the model. It just sort of extends the model.
17:05
And then when they defined this additional layer of abstraction, they defined the property "realized in," so this Mozart clarinet quintet would be realized in a
17:23
bf:NotatedMusic. It would also be realized in a bf:Audio if it's recorded. PMO here refers to the Performed Music Ontology, and
17:42
OK, so the music people took a look at the BIBFRAME event model, which I'm going to talk about in just a moment, and said that this relationship, relatedTo, isn't good enough. We want to relate works and events.
18:01
So here you have in this case an event: a performance of the Mozart clarinet quintet is a bf:Event, and in music terms it's a PMO performance. And then you also have the event-to-work relation, "has recording."
18:21
So let me talk about BIBFRAME events for just a moment. Um, how much more time? What? Oh. OK, so late in the game we added an additional core class, bf:Event. So let's say there's a concert. The concert is recorded, and a book is written about the concert.
18:45
Well, the concert is an event, it's a bf:Event. The recording of the concert is a work, the book written about the concert is a work, and the concert is the subject of the book. So let me just say what I mean by subject of the book, and digress for a moment
19:01
to talk about BIBFRAME subjects. When we express a subject in BIBFRAME, we want to give a type. So, for example, this subject is a person, and then you give the actual subject. This is a bit of hand waving; that subject could have as its direct object a MADS record or so forth, but this is a little more human-expressible.
19:22
So you express it as: this is a person, the person is John Wilkes Booth. As opposed to, say, a work, because it could be a bf:Work, it could be a book about John Wilkes Booth. It could be a BF geographic, as I said it could be a bf:Work, or,
19:41
the point that I was coming to, it could be a bf:Event. This is one of the main reasons for describing... I'm getting the choke signal here. Just give me another 30 seconds.
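The concert example can be written out as a handful of triples to show a typed subject in practice: the book's subject is the concert, and the concert carries the type bf:Event. This is a minimal sketch over an in-memory set; `rdf:type` and `bf:subject` are real terms, while the `ex:` resources and the `ex:recordingOf` link are hypothetical stand-ins rather than BIBFRAME or PMO properties.

```python
# Hypothetical resources for the concert, its recording, and a book about it.
TRIPLES = {
    ("ex:concert", "rdf:type", "bf:Event"),
    ("ex:recording", "rdf:type", "bf:Work"),
    ("ex:book", "rdf:type", "bf:Work"),
    ("ex:recording", "ex:recordingOf", "ex:concert"),  # placeholder link
    ("ex:book", "bf:subject", "ex:concert"),
}


def types_of(resource):
    """All rdf:type values asserted for a resource, in sorted order."""
    return sorted(o for s, p, o in TRIPLES if s == resource and p == "rdf:type")


def subjects_with_types(work):
    """Each subject of a work, paired with that subject's types."""
    return [
        (o, types_of(o))
        for s, p, o in sorted(TRIPLES)
        if s == work and p == "bf:subject"
    ]


print(subjects_with_types("ex:book"))
```

Reading the type alongside the subject is what lets a display say "this book is about an event" rather than about a person, a place, or another work.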
20:01
bf:eventContent and bf:eventContentOf are two properties that were defined for the purpose of the event model, and we've extended bf:relatedTo so that you could have work-to-event relations,
20:22
and the music people created this property, "created for," PMO created for, because they didn't think that, I can't go back, but they didn't think that the
20:41
relatedTo property was specific enough. Anyway, they have defined additional PMO classes, concert, performance, and festival, and in addition they've developed a whole hierarchy of event types. And the rare materials folks have created
21:01
a custodial event, an event type, and a whole lot of custodial event types that are subclasses of custodial event. I have a lot of material that I wanted to discuss on BIBFRAME titles. I would suggest you go to the presentation I did yesterday, and there's a
21:21
wealth of examples on BIBFRAME titles. And so, I guess that's it. Thank you. OK, thank you. There's only so much you can fit into ten minutes, so I think we have to move on to the next speaker. Any questions, I suggest you take up during the coffee break.