JATS and the Standards Ecosystem
Formal Metadata
Part Number | 8
Number of Parts | 16
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/21799 (DOI)
Production Place | Washington, D.C.
JATS-Con 2013, 8 / 16
Transcript: English (auto-generated)
00:00
Thank you very much, Laura. Good afternoon, everyone. Maybe we'll have a kind Grand Chancellor sparring this year than you've had in the past. I want to talk just for about 15 minutes about JATS and the standards ecosystem. When a few of us sat down at this point a dozen years ago to start developing JATS, it
00:20
was really its own little island. We had basically two missions that we had to fulfill. The first was creating a better DTD for PubMed Central, and the second was creating a DTD that would work for an archive that ended up being located with Portico. Over time, however, even though JATS
00:40
is built on standards like XML and Unicode and MathML, JATS itself wasn't completely an island. It's become much less of an island because it works within an ecosystem, particularly for the metadata. So where JATS started out as a markup solution,
01:00
what it's morphed into is what I call now a workflow solution, that by integrating not just the full text content but rich metadata, what JATS facilitates is, first of all, the metadata transmittal of a submission. So I think most of the major online submission and peer review systems at this point
01:20
can transmit the metadata to you in a JATS or JATS-like format. It facilitates end-to-end production activities. Think about the presentation from Sheridan this morning where they said, we do everything in between peer review and online hosting. And so JATS is facilitating that whole intermediate workflow.
01:41
And finally, it facilitates distribution of metadata to your customers. So what's happened is JATS and BITS also have become central to entire workflows. So what I wanna do is briefly touch on a number of standards and projects, some of which directly interact with JATS, and others on this list are things
02:02
that if you're involved with JATS, you're involved with journal publishing, you should know about even though there's not necessarily a direct connection. And I'm just gonna go through these all fairly quickly and my slides which will be posted do have the URLs for them. The first is the Supplemental Journal Article Materials Project.
02:21
This is a joint NISO/NFAIS project that published a recommended practice last year. Quick show of hands, how many of you publish supplemental materials? Okay, I'm seeing at least a third of the hands, maybe half the hands in the room go up. If you publish supplemental materials, you need to read this. There's a lot of really valuable information.
02:41
In terms of how it integrates with JATS, I recommend that you be here tomorrow morning promptly. I think it's at nine o'clock. It's the first presentation tomorrow. We're gonna have Karen and Kimberly talking about integrating this recommended practice into JATS. Second one is ORCID.
03:01
ORCID has really taken off in the last year. This is to create a persistent and unique identifier for every author. So you know John Smith number one from John Smith number two from John Smith number 10,023. This is becoming increasingly important
03:20
in a research world where particularly funders wanna be able to track the authors that are involved in grants from the point of submission to the grant to all of the publications that are published under that grant. And by being able to track the authors with ORCIDs, that's one of the things they can do. For those of you who haven't upgraded to JATS 1.0,
03:41
ORCID, I'm finding with the people that I work with on a regular basis, this is one of the biggest reasons for upgrading to 1.0 because the contrib-id element is in there. It's not just for ORCID. There's a contrib-id-type attribute so that you can say which kind of an author ID it is, whether it's ORCID or anything else.
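As a rough illustration of the contrib-id element and its type attribute, here is a minimal Python sketch that pulls author identifiers out of a JATS contributor group. The sample XML, the placeholder identifier values, and the function name `contributor_ids` are assumptions made up for this example; they are not taken from the talk or from any real article.

```python
import xml.etree.ElementTree as ET

# Hypothetical JATS 1.0 fragment; the names and identifier values are placeholders.
JATS_SAMPLE = """
<contrib-group>
  <contrib contrib-type="author">
    <contrib-id contrib-id-type="orcid">http://orcid.org/0000-0000-0000-0000</contrib-id>
    <name><surname>Smith</surname><given-names>John</given-names></name>
  </contrib>
  <contrib contrib-type="author">
    <contrib-id contrib-id-type="isni">0000 0000 0000 0000</contrib-id>
    <name><surname>Smith</surname><given-names>John</given-names></name>
  </contrib>
</contrib-group>
""".strip()


def contributor_ids(jats_xml):
    """Return (name, id type, id value) for every contrib-id in the fragment."""
    root = ET.fromstring(jats_xml)
    rows = []
    for contrib in root.iter("contrib"):
        surname = contrib.findtext("name/surname", default="")
        given = contrib.findtext("name/given-names", default="")
        full_name = f"{given} {surname}".strip()
        for cid in contrib.findall("contrib-id"):
            rows.append((full_name, cid.get("contrib-id-type"), (cid.text or "").strip()))
    return rows


ids = contributor_ids(JATS_SAMPLE)
for name, id_type, value in ids:
    print(name, id_type, value)
```

Two authors with the same name come out as distinct records because the identifier, not the name, tells them apart.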
04:00
It might be an ISNI, for example. You can specify that. And in fact, just within the last hour, I saw an informal request to the working group that a verified attribute be added to contrib-id because with ORCID, there's an important concept of verification. This is a picture from an ORCID presentation last fall,
04:24
a workshop that was in Washington. You'll notice ORCID IDs popping up all over the place. The one that I regret not showing up here, maybe there isn't a laser pointer here, on the XML to Crossref. There's no ORCID ID bubble on that, and that's a critical part of this
04:42
because Crossref really wants to have the ORCID IDs in the deposits there. And so the idea, sorry, the idea behind the ID is that it would be added in at the submission stage. It carries through into your JATS XML in production and then is available for any
05:00
of the metadata transmissions that you do. PIE-J: this is probably one of the lesser-known standards or projects that I want to talk about. This is about presentation and identification of e-journals online. This is a NISO project. It's a recommended practice that's actually been published, and this basically says if you're putting up a website
05:22
with your journal, there's some critical information you want to make sure you put on that website. This is just one picture from the PIE-J recommended practice. This particular publisher has been very careful about putting up their ISSN, the full title, the abbreviated title, the CODEN, and so on.
05:41
They don't have a title history, but there are other pictures I could have put in where you can click a link and get the entire title history. You can very quickly get to the editorial board. I don't know how many of you have ever gone to the website of a journal and just tried to find the ISSN of that journal because you might want to know it. It's shocking how many journals don't, in fact,
06:00
have that information easily accessible. So if you're putting your content online, I do recommend that you take a look at PIE-J. The interaction with JATS here is most of this is metadata that you may well want in your JATS files. Not all of it, you wouldn't necessarily want the title history, but it is something to think about in terms of your overall metadata workflow.
06:20
This is an older project, the Journal Article Versions project. Again, it's a NISO project. This was to develop a set of recommended terms that could be applied to iterations of an article's life cycle. The reason I put this in is because that's undergoing some addendum work now to set up a better set of terms for articles that are in the proof category.
06:44
Crossref and JATS, there's a lot going on in terms of interaction between Crossref and JATS. When Crossref started, the basic idea was that for any given article, you wanted to provide Crossref the key metadata, authors, title, volume, issue, page, and the DOI and the URL,
07:02
and that was pretty much where it stopped. Crossref has grown tremendously. It's become this incredible central hub for publishers for all kinds of cross-publisher initiatives. So the first one I want to mention is Crossmark. Again, this is one that doesn't have to do as much with JATS, but I want to highlight it because one of the areas where, unfortunately,
07:23
we're seeing a lot growing in sort of this push to publish rapidly, we're seeing more and more corrections and retractions. If anyone's ever looked at Ivan Oransky's Retraction Watch, that is a publication that is growing, not shrinking. If your content is indexed in PubMed, it's great
07:41
because they do a phenomenal job of keeping track of corrections and retractions. But if you're not in life sciences, you're not in PubMed, there is no central clearinghouse for being able to electronically keep track of retractions, where, for example, you could query a reference and say, is this reference in this list something that's been retracted? Well, with Crossmark, we now have the ability to do that.
08:03
One of the key things, however, with Crossmark is you've got to get the metadata into Crossref. And a lot of people have the impression that, okay, I'm just going to do this going forward, I'm only going to put the Crossmark information in my PDF going forward, but guess what? You can put your metadata into Crossref even if you don't update the old PDFs.
08:22
And this is important because what I'd personally love to see a lot of people do is go back and put their critical correction information going back as far as you can into Crossref, even if you don't update your PDFs, because then people can very quickly find out information about published content. One quick piece of breaking news on Crossmark,
08:41
up till now there has not been a standard vocabulary for correction types, that is something that Crossref is now working on, so keep your eyes open for that. Crossref Metadata Services is a suite of tools that allow metadata mining through Crossref. There are a couple of JATS implications for this
09:01
in terms of what you can now deposit to Crossref. So remember a moment ago I listed just a few key elements that Crossref was originally set up to handle and deposit. Now there's a whole suite of additional elements including, first of all, the Crossref deposit schema now supports MathML. So particularly if you're in engineering or physics
09:20
and you have equations as part of article titles, you can now deposit those with the MathML. The second thing is the Crossref deposit schema now supports the entire JATS abstract model. So what that means is you can actually deposit the abstracts of the articles that you publish into Crossref, which means that people can then text and data mine through the Crossref Metadata Services.
09:43
So that's one more way that you can expose your content to people throughout the world. FundRef is an initiative that's taking off very quickly with Crossref. I talked before with ORCID how a lot of funders are more interested than ever in tracking
10:01
the lifecycle of the grant to the publication. And so one of the things that funders would like to see is the metadata for all of the funding information, particularly the funding agency and the contract number tied together. Now the NLM DTD years ago did have support for this. One of the changes in going from 2.3 to 3.0
10:23
and then into JATS was a rework of that funding model, so it's actually much richer and more robust. And that can play really well with Crossref in terms of depositing information to Crossref. If you are looking at JATS 1.1d1, one of the newest elements is institution-wrap.
10:42
And as part of that funding information, you can now have institution IDs along with an institution name inside of the institution-wrap element. And that can be used both within the affiliation, if you just want to mark the institutions within an affiliation, or more importantly in this context, it can be used within funding-source
11:01
so that you can uniquely identify the institutions. Now Crossref does have a list of recognized funding agencies. This is a list that they got from Elsevier. In its initial incarnation, it's got about 4,000 agencies. Let me skip ahead for a moment.
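To make the funding markup described above a little more concrete, here is a minimal Python sketch that reads a funder name, a funder identifier, and award numbers from an institution-wrap inside funding-source. The funder name, the identifier value, the award number, and the function name `funding_records` are all hypothetical placeholders; the `institution-id-type` value shown is also an assumption, not a required vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical JATS 1.1d1 fragment; funder, identifier, and award number are placeholders.
FUNDING_SAMPLE = """
<funding-group>
  <award-group>
    <funding-source>
      <institution-wrap>
        <institution>Example Research Council</institution>
        <institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/000000000000</institution-id>
      </institution-wrap>
    </funding-source>
    <award-id>ERC-12345</award-id>
  </award-group>
</funding-group>
""".strip()


def funding_records(jats_xml):
    """Collect one record per award-group that identifies its funder with institution-wrap."""
    root = ET.fromstring(jats_xml)
    records = []
    for group in root.iter("award-group"):
        wrap = group.find("funding-source/institution-wrap")
        if wrap is None:
            continue  # funder given as plain text only; nothing to identify uniquely
        id_el = wrap.find("institution-id")
        records.append({
            "funder": (wrap.findtext("institution") or "").strip(),
            "funder_id": (id_el.text or "").strip() if id_el is not None else None,
            "id_type": id_el.get("institution-id-type") if id_el is not None else None,
            "awards": [(a.text or "").strip() for a in group.findall("award-id")],
        })
    return records


records = funding_records(FUNDING_SAMPLE)
print(records)
```

Keeping the funder identifier and the award number in one record is what lets the funding agency and the contract number stay tied together downstream.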
11:20
That list is not complete, as you can imagine. There are more than 4,000 funding agencies in the world. These are some, excuse me, acknowledgements that I pulled from articles on PubMed Central last December. Anything in green here was an agency that is actually recognized within the Crossref set. Orange means it's close but not an exact match,
11:41
and red means it wasn't actually in the Crossref data. So if there's an agency that is not in that data that's listed within your article content, you can report it in at the fundrefregistry.html. But just quickly to show you an idea here, this is the FundRef workflow, and the key thing if you look down
12:00
between the three and six on the bottom is publisher production systems. The idea is integrating this metadata in, which means getting this metadata, oftentimes from a submission system, into your JATS XML, and then being able to push it out on the back end. License metadata is another area
12:22
that's a hot topic these days. I wanna make sure I finish on time so you get your full break, because the next paper after the break is one that I heard at the JATS user group meeting that we held last October in lieu of JATS-Con. It's a great paper. It really made some incredibly important points about inconsistent XML,
12:42
and one of the areas that it really honed in on is inconsistent license metadata. This is critically important to make the content machine-readable. It's no longer, as was pointed out in a couple of other presentations today, it's no longer about having human-readable XML. It's about having XML that machines can act on.
13:02
So I recommend that you come back promptly after the break for Daniel's, it's Daniel, right? Yes, thank you, for Daniel's paper. He's waving his hand up there. Crossref has added an element, again, to their deposit schema for license_ref information where you can essentially put
13:20
Creative Commons license information into your Crossref deposit, just as you can put that Creative Commons license information into your JATS XML file. So again, you can go from A, JATS XML, to B, Crossref deposit, very simply and easily with an XSLT transform.
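The talk describes doing this mapping with XSLT. As a stand-in, here is a minimal Python sketch of the same idea: read the license URL from a JATS license element and emit a license_ref element for a deposit. The `ai:program` wrapper, the `AccessIndicators` name, and the namespace URI reflect my understanding of the Crossref deposit schema and should be treated as assumptions to check against the current schema; the sample XML and the function name `license_refs` are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
# Assumed namespace for Crossref access-indicator elements; verify against the deposit schema.
AI = "http://www.crossref.org/AccessIndicators.xsd"

# Hypothetical JATS permissions fragment carrying a Creative Commons license URL.
JATS_PERMISSIONS = """
<permissions xmlns:xlink="http://www.w3.org/1999/xlink">
  <license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/">
    <license-p>Distributed under a Creative Commons Attribution license.</license-p>
  </license>
</permissions>
""".strip()


def license_refs(jats_xml):
    """Build an ai:program element with one ai:license_ref per JATS license URL."""
    root = ET.fromstring(jats_xml)
    program = ET.Element(f"{{{AI}}}program", {"name": "AccessIndicators"})
    for lic in root.iter("license"):
        href = lic.get(f"{{{XLINK}}}href")
        if href:  # skip licenses that only carry human-readable text
            ref = ET.SubElement(program, f"{{{AI}}}license_ref")
            ref.text = href
    return program


program = license_refs(JATS_PERMISSIONS)
ET.register_namespace("ai", AI)  # serialize with the conventional "ai" prefix
print(ET.tostring(program, encoding="unicode"))
```

The sketch only carries the URL across. A license stated only in prose inside license-p has nothing machine-readable to map, which is the consistency problem the talk points to.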
13:40
And so this is something you should be aware of. And what this means is if you put that correct information in there and you combine that with the Crossref Metadata Services, you can then start doing automatic mining of what's freely available under what kinds of terms. This all plays into another active project that's going on with NISO called the Open Access Metadata and Indicators Project
14:03
because if you think about it, we have Creative Commons which states in a very clear fashion what the licensing terms are, but what it doesn't state is what the open access availability would be. And so the goal of this project is to harmonize all of this.
14:21
And the implication for JATS, and this is one of these sort of stay tuned to this space or stay tuned to this channel messages, is there will probably be some best practices coming out about how to use the license and open access elements that are in JATS. So stay tuned to this. I'm not sure exactly where this is gonna land, but the people who are working on this project
14:41
are actively talking to the JATS working group and so we are trying to make sure that we all stay on the same page. How many of you know what CHORUS is? Okay, about a third of the hands in the room. For those of you who don't know, CHORUS is a response to the broader US government mandate that not just life sciences content that is NIH funded,
15:03
but all government funded research be freely accessible to the public. However, unlike PubMed Central where the government built infrastructure for this, and NIH has had the resources to do it, a lot of agencies through the government don't have the resources to be able to build
15:21
this kind of infrastructure. And so CHORUS is an arrangement organized by publishers to be able to expose the open access information much more easily and make it discoverable on the publishers' websites. What's really neat is this works based on existing infrastructure.
15:40
So what we've got in the upper left hand corner is the publisher manuscript workflow. This is your production workflow. And then the last circle that got filled in just below that is the metadata deposits. And what's this relying on in the metadata deposits? FundRef, Crossmark, ORCID, license_ref, and ultimately having DOI links
16:02
for all of this down at the bottom. If all of these pieces of infrastructure are put together, CHORUS can work. So the critical part of pulling together CHORUS or any other major initiative where it's an end-to-end workflow these days comes down to what are the next steps you're going to do as publishers.
16:22
So my takeaway recommendations are first of all, review some of these standards that I've talked about if you aren't familiar with them already, and think about what those metadata requirements are. Then what you want to do is figure out your workflow for the best metadata integration path. That in and of itself is a totally separate talk.
16:42
It's a non-trivial set of issues, but you want to think carefully about how you want to do that and consider as part of this the upgrade to JATS 1.1d1. Unless you're NLM 2.3 or earlier, the upgrade should be fairly painless because everything is pretty much backwards compatible. In implementing a revised metadata workflow,
17:01
think really carefully about what your QA steps are. Metadata QA is probably the single most important, but in some cases the most overlooked part of quality assurance of XML. So you want to think carefully about how you integrate that and ultimately what you want to do is provide rich metadata to Crossref,
17:20
to PubMed or PubMed Central, to any of the OA funders or expose what they want exposed or to any other organizations downstream that are consuming your content, some of which we may not have even thought of. Again, you're going to hear about one of those after the break. So thank you very much and I'm happy to take any questions.
17:47
Okay, this is not a question-heavy audience today.