The Front Matters: Capturing Journal Front Matter Content with JATS


Speech transcript
The presentation today is titled "The Front Matters: Capturing Journal Front Matter Content with JATS." In other words, we basically thought that, since we're all here to talk about tagging articles, we would talk about something that is not an article at all and that gets put pretty consistently in the chaos category.
As a quick disclaimer, or kind of a jargon issue: you'll notice that I use front matter and journal matter pretty much interchangeably today, because I like variety, and also because many journals no longer have a traditional front. Historically, the phrase front matter probably brought about the same sort of mental image for most of you, but more and more, front matter has become an idea rather than a place in a journal. Journal matter can be found in a variety of places under a variety of names.
Before we get too far into this, I'm going to introduce my two co-authors on the paper. The three of us on one presentation kind of makes us sound like a Motown trio. First there is Rachel Carter. She is a journal manager for PubMed Central here at NCBI within NLM and has a master's in library science from the University of Maryland. Rebecca Mooney also has a master's in library science from Maryland. She is a former journal manager for PubMed Central here, and now she works as a project analyst for the American Association for the Advancement of Science, which is a mouthful. I'm giving you those brief bios just to set the stage for how we approached the project we're going to be talking about today, which was as individuals who, by the nature of our jobs, spend a lot of time thinking about the role of PMC as an archive, and as librarians who have a passion for preserving information and making it readily available to as wide an audience as possible.
So, the big picture. As we all know, for over 10 years now NLM has been a leader in the field of archiving scholarly articles in a way that supports research in the biomedical community as well as the preservation of content for future use. With the growth of digital publishing and archiving, there has been a conversation within the academic library community regarding what is being digitally preserved as well as what should be. I'm stealing this quote from Dr. Marcum, an associate librarian at the Library of Congress, who writes: "Decisions must be made about what will actually be saved for future use. Will content consist only of articles in a journal, or will it also include front matter, such as the names of the members of the journal's editorial board?"
Acknowledging this question, the National Library of Medicine wanted to investigate options for supporting the archiving and preservation of non-article content. Addressing the issue from within the context of PubMed Central, we looked at the what, why, and how of the preservation issue. As for the what and why, we decided we wanted to focus on front matter as the most significant part of non-article content, as it helps to define the historical context of the journal and provides a more complete overall picture of the journal's publication. As for how we should preserve this material, that's why we're up here for the rest of the session.
Just a small overview first. Our focus was doing this for PubMed Central, so we should discuss how PubMed Central is structured. Many of you probably already know that it's a full-text database and that there are four methods by which an article can end up in PMC. For the purposes of this project, we're focusing just on the journals that submit material through method A. Methods B through D get a little more complicated, and people get confused by them often, so you only have to think about one. Method A is when the journal deposits all NIH-funded final published articles in PMC, to be made publicly available within 12 months of publication, without the author being involved. Basically, it is just an exchange between us and the publisher.
The article submissions are tagged with metadata at the journal and article levels as dictated by JATS. Most of you probably know that PMC only renders the table of contents based on the existing article metadata and organizes it by article type. By capturing journal matter in addition to articles, we think that PMC can open the door to being a more complete archive and better preserve the environment in which the article was published.
Currently, when you look at a journal in PubMed Central, you can find front matter through the banner links. I'm going to make a terrible mistake and actually try to go back and forth between the internet and PowerPoint, so bear with me. But this information is not being archived by PMC, nor is the information in the banner specific to what was current at the time of a given article's publication. Rather, the journal matter content lives on the publisher's website, and the banner always links to the most current version of the front matter, regardless of which issue an article was published in. So in this example [journal name unclear], you can see that this is a journal issue from 1996. If you click on the banner there, you are going to be taken to the most current version on the website.
And so the "About the journal" page is the one from 2012, and the editorial board and so forth are also current. Our thinking was that this fails to maintain the historical context of the archived articles that we have.
OK. That said, the links in the journal banner that you just saw should give you a pretty good idea of the non-article content we chose to focus on preserving. I point this out because non-article front matter content can be tough to define. I come from a book publishing background, so I came here with a very specific, linear definition of front matter that basically said front matter begins with a title page and continues through any prefaces or forewords that may precede the real meat of the work. Looking at this from a PMC perspective, where there is already a way to capture article-style content and a means by which to render TOCs, we chose to focus our efforts on the non-article content that was being overlooked and going unpreserved. So this is things like the editorial boards, the journal mission statement, the submission guidelines, et cetera. What we decided was out of scope was advertisements. We saw forewords and prefaces as something that could easily be captured in the article model and probably beyond the scope of this project.
We have a formal animated timeline here to give you an idea of how the project fits into the big picture. When the NLM DTD was originally developed, its purpose was exclusively to capture article content. In 2001, PMC staff created the issue admin DTD, which was made available to publishers who expressed an interest in submitting additional material, such as journal covers, to the PMC archive. While still used today, the issue admin DTD is based on the PMC-1 DTD (bonus points if you know what that is), which is no longer used. In addition, the issue admin DTD fails to give the user a means to tag front matter in a very meaningful way. For those of you who were in attendance last year, you may remember Atypon's presentation about Issue XML, which was designed to capture additional issue-level metadata in order to build functional tables of contents from a store of articles. Which brings us to today. The timeline implies that we did this in one year [unclear]. Anyway, we envisioned a broader and different scope than Atypon for this project. The PMC Journal Matter DTD is intended as an extension to the most current version of the tag suite that offers options for semantic specificity in tagging.
We did some digging on the web to find comparable projects that might already be out there and save ourselves some time, but we didn't have any real success. In the rare cases where the more archival databases such as JSTOR provide front matter, it is typically only as a PDF scan of the first few pages of the journal. We felt that the PDF format had some limitations. Namely, it assumes that there is an issue to scan, which is often not the case with many online journals. The PDF is also a poor format for updating content: it would require new scans to be made of the complete front matter every time a change is made to any of the front matter content. And finally, the use of PDF documents is limited by the availability of certain platforms and technologies, which is something we've all heard many times. So we believed that finding a way to capture this content in an XML structure would avoid these pitfalls and allow for easy updating of documents, in addition to making the material queryable and reusable.
Once we had made the decision to use an XML option, we looked first to JATS. That's why we're here, and we probably would have been fired if we had said there was something better. In addition, it really suited our needs because it's flexible and provides an existing framework for capturing journal content, and again, as I said, we were working within PubMed Central.
So why an extension? Why not use JATS as it is and avoid public speaking entirely? Simply because we didn't want to capture this content using the existing JATS article model, because it isn't an article. We spent a lot of time trying to come up with a scholarly way of putting that, but it's pretty much the crux of the matter. While the existing JATS model is flexible, there are limitations that would restrict the user's ability to capture non-article content. For starters, JATS does not provide a means for differentiating between content published as part of an issue and content that isn't issue-specific, or for tagging issue-specific metadata at a level higher than article meta. As we were not dealing with articles, this was a problem. JATS also did not provide a way to meaningfully capture the content we were looking at; for example, there is no way to capture editorial boards in a useful, semantically sound way. Consequently, we needed to create new elements and attributes to reflect front matter more clearly.
Having taken the first step down the road of addressing how to preserve the content, we determined that the goals were to capture front matter content in the environment in which it was published, working as much as possible within the JATS framework. We also wanted to create a DTD that would allow for the same versatility in both use and rendering that the JATS article model provides.
This slide offers a quick and concise overview of the next steps that we went through in creating the PMC Journal Matter DTD. For the most part, we approached this project very much like you would expect librarians to: we went to the library stacks, we performed online searches, determined the way the information could be organized, turned the data into something users could use, and then tested and kept testing.
We kicked things off with this highly scientific method that you see here, involving photocopiers and highlighters. We took note of common components present in our sampling and created a list of several different content types that compose front matter. The highlighted sections you see for the most part reflect the common breakdown of journal matter typically found in those PMC journal banners like the one we looked at earlier: things like general information, contributor information, publisher information, and editorial boards.
The next step was to use this preliminary data to list the different elements that would need to be present for each particular content type that we had identified at the previous stage. We generated sample documents of how we anticipated the content types being tagged, an example of which you see here.
By creating samples of this nature, we were able to see where non-article content diverged from article content and where new elements would need to be created as an extension of JATS. Combining these new elements with the applicable elements from JATS, we developed the first iteration of the PMC Journal Matter DTD. And here is a sneak preview of what you have to look forward to.
In order to test the early form of the DTD, we tagged journal matter from our original highlighted samples. After making some adjustments, we were able to use the DTD and entity files that we created to generate tagging guidelines and documentation pertaining to the project, which were crucial for the second phase of testing. Phase two was to ask the journal managers at PMC to assess the DTD by tagging unique front matter samples. Along with the documentation, they were provided with both issue and non-issue samples from multiple publishers that covered a variety of content types. Final adjustments to the DTD and documentation were made based on their user feedback.
So the culmination of this whole process brings us to the technical details of the PMC Journal Matter DTD, and we want to take this opportunity to spend the next several minutes introducing the various parts of a PMC Journal Matter XML document and the framework for tagging its content. First, a few nuts-and-bolts details. The DTD calls upon two files from JATS, the Journal Publishing custom modules entity file and the modules entity file, and it also consists of two files that we created specifically for the project, which I'm not going to show you, so you'll have to take my word for it: the PMC Journal Matter DTD and the PMC Journal Matter custom entity files.
To start tagging a document, you first have to define the root element and its attributes. In this case the root element is journal matter, and the required attributes are journal matter type and content type. So in the example you see here, the journal matter type is issue and the content type is edboard. Unsurprisingly, you can interpret that to mean that this is an editorial board that is specific to an issue of a journal.
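As an illustration of the root element just described, here is a minimal Python sketch that reads the two attributes from a one-line sample. The spellings `journal-matter`, `journal-matter-type`, `content-type`, and `edboard` are assumptions reconstructed from the spoken description, not copied from the DTD, and `describe_root` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The element name, attribute names, and the
# "edboard" value are assumed spellings, not copied from the actual DTD.
SAMPLE = '<journal-matter journal-matter-type="issue" content-type="edboard"/>'


def describe_root(xml_text):
    """Return (journal matter type, content type) read from the root element."""
    root = ET.fromstring(xml_text)
    if root.tag != "journal-matter":
        raise ValueError(f"unexpected root element: {root.tag}")
    return root.get("journal-matter-type"), root.get("content-type")


matter_type, content_type = describe_root(SAMPLE)
print(f"type={matter_type}, content={content_type}")
```

Read this way, the pair of attributes is enough to say what kind of front matter document is in hand before looking at any of its content.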
The journal matter type attribute was key in resolving two of the primary challenges we faced while working on the creation of the DTD: first, how to generate a foundation for organizing and labeling the front matter content, and second, the question of can we, or should we, tag all of this content in one document.
To address these issues, we start with journal matter type. After reviewing our data, we determined that issue and non-issue material must be captured and labeled separately, because there are instances of front matter content that are not published in an issue. For example, BMC Cancer, which I believe is what we're looking at here, publishes articles individually on a rolling basis and without any issue-specific information given, yet they still maintain information you would look for at the front of an issue, such as an editorial board. You can kind of see; I don't know if this is a clickable one. So you can see here you have your standard article. It has everything you would expect from an article, but it's not in a journal issue, and that kind of complicated things for us.
So consequently, according to the PMC Journal Matter DTD, the journal matter type can be either issue or standing, and standing is the word we decided on to define non-issue-based content like what we just looked at. In addition to offering a way to capture standing matter content, this attribute prevents a hybrid of issue and non-issue content in the same XML document. It also means content that changes in each issue can be more easily updated without dealing with standing documents. Depending on the content that is being tagged, it's possible for a single journal to tag some material as standing and some as issue. In this example, I don't know how this publisher defines it, but you could obviously believe that they will change their cover every issue, whereas their editorial policies or their submission guidelines might just be something that sticks around for a few years.
The second attribute on the root element is content type. This allows the user to categorize the nature of the material being captured. Content type was crucial in helping us meet the goals we had set out and in addressing the challenges we faced. It gives the user flexibility in tagging and rendering, gives meaning to the content of the document, and organizes information in a way that is valuable to PMC while still reflecting how publishers currently classify this information. As with breaking journal matter down by issue and standing, separating front matter into content types allows for selective editing; that is, a publisher can update specific documents as they change rather than resubmitting all the front matter content. It also permits the user to more easily tag or render the content in a way that best suits their needs. Alternatively, a user could choose to capture all the front matter in one document if that is what suited their needs, but it would result in a loss of meaning.
So here I have six different possible content types, and basically the breakdown is general information, author information, publisher information, editorial board, cover, and other. These content types were developed to be physically and conceptually manageable for the user, both in terms of the length of the document and the content it may contain. But please note that the content types are merely suggestions, not required, so you could write out XML without them, but we feel like this allows for adaptability for multiple publishing models and usage.
Following the root element, we break down front matter documents into four main elements, which you see here in the full tree: journal meta, issue meta, document meta, and body. Issue meta is the only one of these that is optional, because it would exist only in an issue; so if you're tagging a standing document, you wouldn't have it.
Journal meta is comprised of elements that come from the journal meta model in JATS. We want to maintain consistency as much as possible with the article model because the DTD is intended to be an extension, and because at this level of metadata, journal meta would include the same elements for both article and journal front matter. This attempt to be consistent with the article model whenever possible is going to be a common refrain in this section of the presentation, as it is intended to show that it's not entirely a new beast, so don't be scared of it. And it's a fact that we will highlight as often as possible by running side-by-side comparisons of the PMC Journal Matter DTD with JATS, to show you where there is overlap and divergence. For example, on this slide you can see how you would tag journal metadata in the PMC Journal Matter DTD and in JATS, and you can see the elements captured here are the same in both, so all the usual suspects.
Issue meta is a new element that was devised in order to capture the issue-specific metadata, the contents of which will look vaguely familiar to those who are familiar with JATS. It takes the aspects of article meta that we considered issue-specific. We believed that issue meta needed to be created to move away from the article-based model toward a more journal- and issue-centric structure that could better reflect how front matter relates to the broader publication, and it is only to be used with documents that have a journal matter type of issue.
The example here shows an instance of issue metadata and illustrates some of the overlap with JATS article meta. As you can see highlighted in yellow, the publication date is captured in both of these examples at this level. Although it may not be clear here, our DTD will allow both the print and electronic publication dates to be captured; I don't think our examples show that, but it's true, take my word for it. The pink shows where the DTDs diverge. Article meta focuses on data relating to the article, and we really wanted to keep this issue-specific at this level, so things like the issue ID, issue sponsor, and issue title are defined here.
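The structural rules described so far (two journal matter types, four main elements, issue metadata only in issue documents) can be sketched as a small Python check. This is an illustration under stated assumptions, not the DTD itself: the hyphenated spellings `journal-meta`, `issue-meta`, `document-meta`, and `journal-matter-type` are guesses from the spoken names, and `check_structure` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed spellings of the three elements the talk treats as always present.
ALWAYS_PRESENT = ["journal-meta", "document-meta", "body"]


def check_structure(xml_text):
    """Return a list of problems in the top-level structure (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []
    matter_type = root.get("journal-matter-type")
    if matter_type not in ("issue", "standing"):
        problems.append(f"bad journal-matter-type: {matter_type!r}")
    children = [child.tag for child in root]
    for name in ALWAYS_PRESENT:
        if name not in children:
            problems.append(f"missing <{name}>")
    # Issue metadata is described as belonging only to issue documents.
    if matter_type == "standing" and "issue-meta" in children:
        problems.append("<issue-meta> present in a standing document")
    return problems


STANDING_OK = (
    '<journal-matter journal-matter-type="standing" content-type="edboard">'
    "<journal-meta/><document-meta/><body/></journal-matter>"
)
print(check_structure(STANDING_OK))
```

A real DTD validator would enforce these rules from the DTD itself; the sketch only makes the described constraints explicit.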
The final metadata element is document meta, and it contains the elements included in the JATS article meta model that are not issue-specific. In other words, elements pertaining specifically to the document go here, such as copyright and licensing, as well as the title and location of the material, either in the journal or online. It's worth noting that document titles for front matter may not be super obvious, at least not so much as when tagging an article. So if you're sitting there thinking, "I've never given my front matter a title," don't worry about the title element [unclear].
This is another nifty side-by-side comparison to illustrate the similarities and differences between this element and JATS article meta. Once again you'll see the similarities in yellow, so the copyright and licensing info is included in each example. The document meta has a document title rather than an article title.
And finally, getting away from metadata and moving on to actual content, we have a fourth document element, which is body. This is the final element of a front matter document. It mirrors the JATS article model in that most front matter can be tagged using a basic section-paragraph structure, for the most part.
The one exception to this rule, to keep things exciting: we created the element person list, to be included in the body, to reflect the nature of journal matter content. This is a container element used to list persons, such as editors or reviewers, that are contained within the body of the document, most often in something like an editorial board. Rather than use the section-paragraph structure here, we wanted to be able to tag names, titles, and associated information in a way that was meaningful but still relatively familiar to most JATS users.
Some of that familiarity stems from the resemblance person list may bear to the existing JATS element person group. This slide illustrates the similarities and differences between them. Some of you may be thinking that person group actually sounds like the ideal way of capturing the names of a group of people. To clarify, it's not an anti-person-group revolution that we're trying to lead; as defined, the person group element did not meet our needs. It's described in the tag library as a container element for one or more authors, editors, or translators named in a reference, so outside of the body of the document. In other words, the person list element was established to account for the fact that JATS offered no options for capturing this sort of content in a meaningful way in the body of the document.
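To show why a dedicated container helps, the following Python sketch pulls names and roles out of an editorial board tagged with a person list. It is only an illustration: `person-list` is an assumed spelling, the `person` wrapper is hypothetical, the board members are invented, and only `name`, `surname`, `given-names`, and `role` are borrowed from existing JATS.

```python
import xml.etree.ElementTree as ET

# Invented sample data. <person-list> and <person> are assumed names;
# <name>, <surname>, <given-names>, and <role> are existing JATS elements.
SAMPLE = """
<body>
  <person-list>
    <person>
      <name><surname>Doe</surname><given-names>Jane</given-names></name>
      <role>Editor-in-Chief</role>
    </person>
    <person>
      <name><surname>Roe</surname><given-names>Richard</given-names></name>
      <role>Associate Editor</role>
    </person>
  </person-list>
</body>
"""


def list_board(xml_text):
    """Return (full name, role) pairs for every person in the document body."""
    body = ET.fromstring(xml_text.strip())
    members = []
    for person in body.iter("person"):
        given = person.findtext("name/given-names", default="")
        surname = person.findtext("name/surname", default="")
        role = person.findtext("role", default="")
        members.append((f"{given} {surname}".strip(), role))
    return members


for full_name, role in list_board(SAMPLE):
    print(f"{role}: {full_name}")
```

With plain section-paragraph tagging the same extraction would need text parsing; tagging names and roles as elements is what makes the board queryable.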
A document can have multiple person lists, depending on how your editorial board is structured. In order to differentiate between the types of editors and capture the content in a meaningful way, we generated a suggested values list for person list type. Therefore, when tagging a document with the content type of edboard, any of these types can be used. The list was determined based on the data we found in our samples, and the language was selected to be consistent with the roles suggested for use with the person group type attribute in JATS.
Another aspect of our extension involves the sec type attribute. This is not a new attribute. As with JATS, sec type is intended to identify the semantic content of a section where it is known, and this is a way to provide information across the structural sections for searching and grouping purposes. To give you a better idea of what we're talking about, in articles the sec types that are currently given are things like materials and methods, conclusions, the usual suspects. It is not a required or controlled attribute.
When general info is the content type of the document, we have provided this list. We found during our research and modeling phase that the general information heading becomes sort of a catch-all for all journal-related content, and we wanted to offer an additional layer on top of the content type attribute to give this content some meaning. The suggested sec types reflect the more standard classifications of front matter we came across during our data modeling. Although it is designed to accept any text as its value, for best practice this attribute should only be used if it's one of the types here.
As for those of you out there who just can't wait to know more, our documentation is live as of, like, yesterday afternoon. So you'll see here that it provides you with an overview much more concise than what I just gave you, as well as some samples of how each of the documents might be tagged, and we have our very own tag library as well.
So you can click around in there. It's got a lot of elements and stuff. We were wary of cutting out too many elements, because the minute you cut out an element, someone says, "But I've always wanted that element in my copyright statement." So we just didn't go there.
And also, as a quick plug for anyone who thought that looked really lovely: we can't take any credit for it. I'd instead strongly urge you to go to the presentation "DtdAnalyzer: A tool for analyzing and manipulating DTDs." [Unclear] and Audrey were responsible for making it look very nice.
And now, since we're on the topic of rendering, you're probably thinking that this would be a great time to show you all the cool things you can do to render journal matter as tagged with our DTD, and you would be right. Unfortunately, we haven't had the human resources to create a renderer [unclear]. So these are some possible use cases. We do anticipate there being various options as to how this content could be rendered, based on the compartmentalized and flexible nature of the DTD, but in the meantime we'd appreciate it if you'd just use your imaginations and imagine something awesome.
And so now we enter the final phase of the presentation: the reflection, where we look back and ponder all the things we wish we had done differently. We do want to take this opportunity to mention some of the possible limitations we see that might still present themselves due to the nature of the content we're trying to capture. The PMC Journal Matter DTD is still relatively untested, aside from what we've done in house. From the get-go, we also felt like we were in slightly uncharted waters due to the lack of an existing model for preserving journal front matter in XML. We found that there is not necessarily a standard structure of front matter that is universal across the publishing industry. Publishers use different naming conventions to label sections that contain similar content, and in other cases front matter sections are organized and displayed in ways that are unique to a specific publication. Also, some publications choose to include content that others omit. As such, we created content types in an attempt to account for these variations and create a consistent and meaningful framework for tagging this material for PMC. But we do recognize that these content types may not be applicable for all future adaptations of the DTD or the specific needs of different publications. We hope that as use cases outside NCBI present themselves, we'll be able to work to address any unforeseen challenges in order to provide a DTD that will be of value to the largest possible audience.
And finally, the PMC Journal Matter DTD was based on the JATS article model, our own data modeling, and the perceived needs of PMC as judged by our experiences working here. We view this DTD very much as a work in progress and as a means of starting the conversation on how best to capture this material to suit the needs of users both within our organization and in the broader publishing community. Looking forward, I would love to come back from today with feedback and potentially even use cases that will allow us to continue to improve the DTD and expand its possible uses.
And so, thank you, and also thank each of these people for being fantastic. [Name unclear] is here today, and she was one of the original team members; we would have been lost without her. So that's all. If you'd like to know more, the link to our live documentation is on the slide; it's also in our references here.
question
without all yet the hi but TC analyzes I just mentioned can has is the mean of the last week I think he was well all just never met you and I think so yes it we agree on so informs the and I was not familiar before the 2 working selling obviously that's a very interesting and maybe a missed but I have a question when you do it with the purpose to be things be found even then not visible in current frame you may want to add the tags that they're are not really there but explaining do the a single that or also held that the fact that that's the text is really not there has been a bad your URI ID pictures on because it is not just to the landscape but you'll calling for new not the girl somewhere near the regional index and still to know what it is we have the degree or have sometimes still isn't all that as if they're actually images so some normal hours will be type was very now that we still have some title maybe image in some image former share either the Disney owner did not really there but explaining 10 years later defined where was that a IMS which is our main sponsors where OK I mean if it if we're talking about images or something of you can provide captions you but also tag alt text for an image on so we're not taking away things that you can already do ingest so you would do it the same way you do that in ingest for the most part on so if you put a picture of you in California on the cover and you want people to know our own and make it searchable OK so so i th that really allow in addition to tagging you know the gift file of the j peg you could provide all text for the image or a caption for the image or everything and hopefully it would be searchable in XML but my my silly but they're have 9 1 pp serious as of sponsors I and that that the young but some of the data sound and service as but when I was added on top and gambled ever movement and they do make sense who when they work to God there's some uh is some things over then years ago all these 
species even tell me more down to prove that all obsolete members and do not yet I understand additional tags that the person doing the input as above the this clear that these is these are added 4 added value not really present for that we have a lot that that we don't like it and I have moment right bad feeling about this after like recently on all the blogs was like there was this added the 1st great Gatsby add in a old magazine glycolysis also we're totally not capturing that on and so we didn't look at ads within the scope of PMC I'm not saying it's impossible I just don't know our how you would do it and get it but not sorry and that it's not something we looked at it outside the scope of what we did yes very bold only option good and I'm not really involved directly wilderness area but I do know that they certainly in the past on show the publishers to of uh produce the access to stick to the distribution company to country of distribution for the journal so F of journals have been disputed in Britain we have UK adverts for our journals are disputed thing Johnson issued distributed in the US will have addressed property the US audience so if you ever done that rate you would need to add the 2 for multiple the audiences that this is another reason and and you know road is treacherous thank you yes but John Mylapore across the mentioned earlier and then your slides of all the reason not to do PDS was because if you use a modifier sampling or something new small modified Euler after all the almost in also is going to help you with those small modifications no and the historical nature of the changes over time are you using share but as to your 1st question of which I've just forgotten so regarding PDS so when we break it down by content type by feeling was so you change the name of your editor in chief you'd only have 2 redo your editor of the editorial board content type of document all the rest of them if they haven't change they stay the same I as for how we 
would maintain them within PMC, that, I feel, is part of the rendering ... As for how you link up which article goes with which: if it's an issue, it's easy, but if it's a standing metadata file, that's something we should be able to handle with the publication dates. I'd like to say it's an implementation thing that we haven't worked on yet, but it seems feasible.

If you were to encode it in RDF and use the Publishing Roles Ontology, you could include both the context and the dates for the roles. Just a thought for next year.

... This is really fun work to look at. I remember, a very long time ago, librarians asking, "How do we capture this ... on the website?" and so on. And I said, "Well, as a temporary measure, let's publish an editorial each year thanking the editorial board, so it's actually an article." But this is an even better solution. Those of us who work for publishers know that you can't believe the front matter as it's printed with the journal. It's often out of date, and the editorial board listing doesn't actually mean that those are the people who reviewed the articles in that issue, because of timing problems involved in printing and so on. So really the problem is that, for publishers as a whole, we have to get our content management act together for articles and article versions, and then this is a whole new area. It would be so much better if we dated everything. So this could be very exciting as a way for publishers to start thinking about some of these problems and get our own act together. So thank you.

Thank you.

... I noticed you mentioned that the table of contents encoding is out of scope. So while you're capturing all the other interesting information, you're not capturing any
information about the table of contents. Based on your librarian background, don't you think that there's a lot of value added there, like multiple-language or editorial annotations in the table of contents?

Absolutely. That was not to say that it's a terrible idea; it's a great idea. We just did not worry about it during the first iteration because it's something PMC takes care of. If it's something we haven't covered yet and you still want to capture it, I don't think there's any reason why you couldn't put it within the body of a document so that it's at least searchable, and hopefully we will eventually get around to creating meaningful tags that go with it. So that is why it's outside the current scope, but it's something that we're very aware of and would like to get to eventually.

This is Jeff again. It's possible we could use the table of contents model that was just released yesterday in BITS ... It's possible that in the work you do you could use that. That model is brand new; take a look at it from your point of view and see what else you need.

Yeah, we'll see. Our contact-us slide is just the general PMC e-mail address, and we will all be here, so find
us if you want.

And I think you might be a little faster ...

I don't remember
clearly, but is there overlap between what you're talking about and the issue XML that was discussed last year?

I mean, only in that we're looking to capture things at an issue level. I don't want to misspeak, but they were looking for ways to capture metadata to put together various tables of contents. Ours was more to just capture the content of the front matter. So in that respect there is some overlap at the issue level, but what we're trying to capture is slightly different.

It sounds like it is. Last year we talked a lot about functional issue metadata, and now we're talking about capturing the content that JATS is ignoring, what JATS is not built to do. And one thing I'll say about this group is that this project came up two or three years ago and I wasn't particularly excited about it, so we started it really as a training project, and I think it has turned into something very useful. So thank you.

... Does your model include anything for dates that would indicate that this piece of journal metadata is valid from this date to that date, so that if an editorial board changes, the front matter captures when an editorial board is in effect and when it's different?

No, but that's definitely a good suggestion. Nothing currently exists for that.

Thank you.

Yes ... When you try to capture journal and issue metadata in a different structure, this information already exists in the specific article-meta elements in JATS documents. So what is your idea about that? Do you plan to replicate that information in this, or do you plan to take it out of the article metadata and then just maintain it at the journal and issue level?

Well, we're not dealing with article metadata at all; we've gotten rid of it. I think what was in the article metadata is now split over issue metadata and document
metadata. Article metadata, I'm sure, would help us figure out which article should go with which front matter, but, I mean, we're just not looking to capture article metadata, so I don't really know how to address that.

So all the information will still remain in article-meta, the journal-level information and the issue-level information, and you will also replicate that into your structure?

Right. Articles would still have all of the usual metadata, and this would be an entirely different document. As it would sometimes apply to more than one document, it would have some slightly different levels of metadata and no article metadata.

Wendell Piez, Piez Consulting. I'm hearing a theme in many of the suggestions and the pushback that we were hearing here, which I think is actually pretty interesting and worth pointing out. As I understand it, what you're presenting in terms of modelling is driven by the requirement to capture information that's already been given to you in journals ... What you haven't yet really focused on is, on the one hand, the production end of the publishing system, which is what some of us have been talking about in terms of, you know, disambiguation or removing redundancy; maybe you're not interested in that. And then, on the other hand, there are certain aspects of the representation of that information which you've decided are out of scope, maybe not for version 1, right, like aspects of display, right? I think that's really important, and I think you've done a pretty good job of presenting the boundaries of the problem you've presented here. But I also think it's really interesting that the first thing that happens when you present this is that we all say, "But this one, but that one," and look for the way
you could push those boundaries, right? Now, what I'd like to suggest is that this is actually an indication you're doing a very good job, because if you weren't doing what you set out to do, we'd be quarreling about that. Instead we're arguing about, well, what are the things that haven't been done yet that could be added or fed into what you're already doing. This is particularly difficult work, and ... the fact that you get questions like this ...

That's a limitation we're very aware of ...

Katie ... the slide around items missing from the ... at first I was a little ... but no, I think it's a great job. And actually, to respond partially to the previous question: I think that the issue-level metadata is repeated within each article, and that repeating that information within this model and within the article content is useful, in that it lets you do a kind of cross-checking and validation, you know, for consistency checks. And I would also like to add ... that adding an issue-meta element to JATS is a very logical thing to be thinking of. I'm not in control over that ... Talk to me at lunchtime ...
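The cross-checking idea raised at the end of the discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `IssueInfo` record and its field names are hypothetical and are not taken from JATS or from the journal-matter DTD; they stand in for whatever issue-level values happen to be repeated in both an article and a front matter document.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class IssueInfo:
    """Issue-level fields repeated in both document kinds (hypothetical names)."""
    journal_title: str
    volume: str
    issue: str
    pub_year: int


def find_mismatches(front_matter: IssueInfo, article: IssueInfo) -> dict:
    """Return {field: (front_matter_value, article_value)} for each differing field."""
    fm, art = asdict(front_matter), asdict(article)
    return {name: (fm[name], art[name]) for name in fm if fm[name] != art[name]}


# Illustrative values only.
front = IssueInfo("Journal of Examples", "12", "3", 2012)
good_article = IssueInfo("Journal of Examples", "12", "3", 2012)
bad_article = IssueInfo("Journal of Examples", "12", "4", 2012)

print(find_mismatches(front, good_article))  # {}
print(find_mismatches(front, bad_article))   # {'issue': ('3', '4')}
```

An empty result means the repeated values agree; any entry points at a field where the article and the front matter document disagree and one of them needs checking.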

Metadata

Formal Metadata

Title The Front Matters: Capturing Journal Front Matter Content with JATS
Series Title JATS-Con 2012
Part 04
Number of Parts 16
Author Carter, Rachael
Funk, Kathryn
Mooney, Rebecca
License CC Attribution 3.0 Unported:
You are free to use and adapt the work or its content for any legal purpose, and to copy, distribute, and make it publicly available in unchanged or adapted form, provided that you credit the author/rights holder in the manner they have specified.
DOI 10.5446/30584
Publisher River Valley TV
Release Year 2016
Language English
Production Year 2012
Production Place Washington, D.C.

Content Metadata

Subject Area Computer Science
Abstract PMC strives to be a comprehensive archive of biomedical journals. Currently JATS provides no way to capture journal- and issue-specific front matter content, such as editorial boards, journal philosophy, submission instructions, etc. We developed an extension to the current Tag Suite, the pmc-journalmatter.dtd, that captures these journal artifacts. In mapping multiple publishing models, we found that front matter exists in two basic forms: issue and standing. The issue attribute should be used for administrative materials that are published in an issue. The standing attribute should be used for non-issue-based and administrative materials that are not published in an issue, but are static, such as information published on a website. Our DTD aims to be flexible enough to accommodate a variety of user needs. It allows users to update different elements of journal matter without the burden of updating all elements by offering separate document types. The document types include: author information, issue cover, editorial board, publisher information, and general information. In order to encapsulate front matter content, we introduced new elements that work in conjunction with JATS. This paper will explore the need for a front-matter-specific JATS extension, the limitations of the article model to represent this type of information, our process and rationale for the data mapping and subsequent development of new elements, and future implementation.
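To make the abstract's two ideas concrete (separate document types, and the issue versus standing distinction), here is a minimal Python sketch. It does not reproduce the pmc-journalmatter.dtd: the class `JournalMatterDoc`, the field names, and the `update` helper are all hypothetical, and only the five document type names and the two forms are taken from the abstract.

```python
from dataclasses import dataclass

# Document types listed in the abstract.
DOCUMENT_TYPES = {
    "author information",
    "issue cover",
    "editorial board",
    "publisher information",
    "general information",
}
# The two basic forms of front matter named in the abstract.
SCOPES = {"issue", "standing"}


@dataclass(frozen=True)
class JournalMatterDoc:
    doc_type: str  # one of DOCUMENT_TYPES
    scope: str     # "issue" or "standing"
    content: str   # placeholder for the tagged content

    def __post_init__(self):
        # Reject values outside the two vocabularies above.
        if self.doc_type not in DOCUMENT_TYPES:
            raise ValueError(f"unknown document type: {self.doc_type}")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope}")


def update(collection: dict, new_doc: JournalMatterDoc) -> dict:
    """Return a copy of the collection with only new_doc's type replaced."""
    updated = dict(collection)
    updated[new_doc.doc_type] = new_doc
    return updated


# Illustrative data: a journal's standing front matter, keyed by document type.
journal = {
    "editorial board": JournalMatterDoc(
        "editorial board", "standing", "Editor-in-chief: A. Example"),
    "publisher information": JournalMatterDoc(
        "publisher information", "standing", "Example Press"),
}

# A new editor-in-chief touches only the editorial board document.
revised = update(journal, JournalMatterDoc(
    "editorial board", "standing", "Editor-in-chief: B. Example"))
```

Keying the collection by document type is one simple way to express the design goal stated in the abstract: a change to the editorial board leaves the publisher information document untouched.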
