Typesetting XML with ConTeXt
Formal Metadata
Number of Parts | 39
License | CC Attribution 3.0 Unported: You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/51378 (DOI)
Transcript: English (automatically generated)
00:10
Thank you. Hello everyone, I'm Dennis from the University of Bern. I'm a subject librarian there, and I'm currently in the process of setting up a new journal where
00:23
we use ConTeXt for typesetting XML, and I will just give you a brief overview of how this works. Just to let you know, I have a couple of code examples, and I'm not sure if you can read them in the back, so feel free to come closer if you need to.
00:41
Okay, the context is this: we have a print journal, Judaica, that we've had before, and we are converting it to an e-only open access journal that we host on our Bern University Publishing OJS platform. I talked about it briefly yesterday in the lightning talk. The task is: we want a single-source workflow, we want different
01:03
output formats: PDF, XML, HTML. JATS is our production format, so we need to go from JATS to all these different output formats. We want a high-quality typeset PDF but no manual typesetting. The requirement is also to have PDF/A and
01:22
no costly software. It's supposed to be free as in beer and as in speech, so it must be open source and reusable, so to say. So for the HTML and XML we have the Lens viewer and XSLT software, but the big question was: what do we do
01:44
with the PDF? How do we get from XML to PDF while meeting all these requirements? So that's the workflow I showed you yesterday, and now we're concentrating on this last step here, going from JATS to PDF. So let's meet ConTeXt.
02:07
ConTeXt is a macro package based on TeX. Some of you might know it: TeX is a markup-based plain-text typesetting system initially developed by Donald Knuth in the late 70s and 80s, so it's kind of a dinosaur. Early on some people
02:29
decided that this system is not really usable by ordinary users, or that it's really complicated, so they started developing macro packages to make life easier for authors. The first of these is LaTeX, much used today; you will probably have
02:45
heard of it or know it or have used it. It's by Leslie Lamport, initially from the 80s and developed until today. And then there's another macro package, ConTeXt. They started developing it in the 90s, and it's still developed today by a Dutch company called Pragma Advanced Document Engineering. It has a slightly
03:06
different approach than LaTeX. Yeah, that's just to introduce it. So why do we use a TeX-based solution? First of all, starting from the bottom of the slide, it's
03:21
open source, it's multi-platform, it works on everything from a Windows PC to a toaster. It's customizable: if you know how to program the system, you can really do a lot of things with it. And last but not least, it gives us high-quality typesetting. That means we have microtypographic extensions,
03:43
margin kerning, font expansion, kerning, tracking, all that typographic stuff that programmers usually don't care about but readers do. And it has the Knuth-Plass line-breaking algorithm to distribute the white space in an optimal way within
04:03
paragraphs. Forget about these details; you will see it in the output, or if you don't see it, that's actually a good thing. So ConTeXt sits on top of that, as I've said. The good thing here is, and it's always a bit of a
04:21
comparison to LaTeX, because most people ask me, why don't you just use LaTeX for it? Well, here's why: ConTeXt gives us a consistent interface. It's not like in LaTeX, where you use packages for all the tasks. If you Google for LaTeX-based solutions, it's always "use this package", and then you have these
04:42
commands, and the commands always differ between packages, so it's not really consistent. ConTeXt is developed by one company, so it's more monolithic: it's one interface, and you have commands that are actually predictable. If you know how to style one element, you will probably also know how to style
05:01
another. Then PDF/A is possible, which is a big selling point, and we can process XML out of the box. No XSLT required, no additional software: you just write your style sheets in the ConTeXt language and then you go straight through.
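The "one consistent interface" point can be illustrated with a small ConTeXt document. This is only a sketch under assumptions: it is not the document shown on the slide, and the font, paper size and spacing values are illustrative choices rather than the journal's actual settings. It shows that headings, lists and quotes are all styled through `\setup...` commands with the same key-value pattern, without loading any package.

```tex
% Sketch only: all values below are illustrative assumptions,
% not the settings used in the talk.
\setuppapersize[A4]
\setupbodyfont[pagella,11pt]

% Headings: the same keys (style, before, after) work for every level.
\setuphead[section]
  [style=\bfa, before={\blank[big]}, after={\blank[medium]}]
\setuphead[subsection]
  [style=\bf, before={\blank[medium]}, after={\blank[small]}]

% Lists and display quotes follow the same \setup... pattern.
\setupitemize[each][packed]
\setupdelimitedtext[blockquote][style=italic]

\starttext

\startsection[title={A section}]

Some text with {\em emphasis}.

\startitemize
  \item first item
  \item second item
\stopitemize

\startblockquote
  A display quote.
\stopblockquote

\stopsection

\stoptext
```

Once the pattern for one element is known, styling another element mostly means looking up the matching `\setup...` command and reusing the same kind of keys.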
05:24
So that's what it looks like; it's a sample ConTeXt document. Those of you who know LaTeX will certainly recognize it: we have backslashes all over the place, braces, brackets, so it's more or less more of the same. The differences are
05:42
subtle. For example, you have a \starttext command down there on line 17 and \stoptext on line 29, instead of \begin{document} and \end{document}. Small differences; the syntax is slightly different. But the other thing is, if you look at the preamble over here, you do everything with these setup commands
06:05
and they work more or less the same for each element. This is very nice. As I've said before, you don't load packages, and you don't have to adjust the output using rather strange, no, not strange, strange is too
06:21
strong, but commands that are used only in the context of one particular package. You can do everything with the same commands, that is what I want to say. So what do we need now? First, we need an XML input file, obviously. Second, we need ConTeXt style setups like those things
06:45
up here, many more of them. And then we need the mapping, to map XML elements to ConTeXt, and this is actually quite similar to what you would do with an XSLT template, except that you do it all in one tool. So we have here a sample
07:04
minimal JATS XML article. We have the front matter over here, which is collapsed; those of you in the front can actually see it. Then we have a section element, paragraphs with italics, list elements, a bulleted list, a display quote
07:24
down here, another section heading, footnotes as well. So what do we do now? In our setup file we have a minimal setup like this. It's just an excerpt, not everything. What we do is start a new setup that I
07:45
call the XML JATS setups, where I first say which elements I want: none, so that nothing comes in that I don't want. And then I select all those elements that I really need. So I start with article, front, body,
08:01
back, and I assign those to corresponding additional macros or setups that have the prefix xml:, and then, for example, xml:body will render the body. The same with bold and italics down there. At the end I have
08:23
to register this setup, and then we can go on. So let's start with the body element. We have again a setup, xml:body, and the main thing here is that we just flush it through. You take the whole element and just pass it through to
08:42
ConTeXt to handle it. But before that we have an additional macro to handle the front matter: title, author, title page, ISSN, whatever you want, you can just pass it through here. Next thing is the paragraph. Again pretty simple: xml:p, flush it through. But the interesting thing here is that
09:09
you can use a filter, for example to check if there's a language attribute, in case we need to change hyphenation patterns. So we check for such attributes, and if they are there, we use another command to pipe them in.
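The mapping described so far might look roughly like the following ConTeXt sketch. It is written under assumptions and is not the speaker's actual style file: the setup name `xml:jats:setups`, the selected JATS subset, the lpath to the article title, the file name `article.xml`, and the way `xml:lang` is turned into a `\language` switch are all illustrative. The XML commands used (`\xmlsetsetup`, `\xmlregistersetup`, `\xmlflush`, `\xmltext`, `\xmlatt`, `\xmlprocessfile`) come from ConTeXt MkIV's XML interface.

```tex
% Sketch only: setup names, element subset and language handling
% are assumptions, not the code shown in the talk.
\startxmlsetups xml:jats:setups
  % Hide everything by default, so nothing unexpected gets through.
  \xmlsetsetup{#1}{*}{-}
  % Map the wanted elements to setups named xml:<element>.
  \xmlsetsetup{#1}{article|front|body|back|sec|title|p|italic|bold}{xml:*}
\stopxmlsetups

% Register the mapping so it is applied when a document is loaded.
\xmlregistersetup{xml:jats:setups}

\startxmlsetups xml:article
  \xmlflush{#1}
\stopxmlsetups

% Front matter: pull out only what is wanted (here just the title).
\startxmlsetups xml:front
  \startalignment[middle]
    {\bfb \xmltext{#1}{/article-meta/title-group/article-title}}
  \stopalignment
  \blank[big]
\stopxmlsetups

\startxmlsetups xml:body
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:back
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:sec
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:title
  \section{\xmlflush{#1}}
\stopxmlsetups

% Paragraph: switch language if an xml:lang attribute is present
% (assumes its value is a language tag ConTeXt knows, such as de or en),
% flush the content, then end the paragraph explicitly.
\startxmlsetups xml:p
  \begingroup
  \doifsomething{\xmlatt{#1}{xml:lang}}
    {\language[\xmlatt{#1}{xml:lang}]}
  \xmlflush{#1}
  \par
  \endgroup
\stopxmlsetups

% Inline elements: flush the content inside a group with a font switch.
\startxmlsetups xml:italic
  {\em \xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:bold
  {\bf \xmlflush{#1}}
\stopxmlsetups

\starttext
  % article.xml is a hypothetical input file name.
  \xmlprocessfile{article}{article.xml}{}
\stoptext
```

Starting from "hide everything" means an element that has no setup yet simply does not appear in the PDF, which makes gaps in the mapping easy to notice during development.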
09:25
I won't show it in detail, but I just use another command to check what is there and to map it into ConTeXt syntax. So we go on. At the end we manually add a paragraph break so that the next paragraph starts
09:40
afterwards. We do the same thing with italics and bold. Just take the element, flush it through, wrap it in a group with these braces, add the necessary commands, emphasis, bold, and you're actually done. You do this with
10:02
every element you have in a usual JATS file, and you end up with this. In one file you define the article layout. This up here is the front matter I've been talking about, and then we style the other
10:22
elements, we take them over, and it's actually a rather painless workflow once it's set up. So, your questions: Should I use this? Why should I use this? The answer is obviously: well, maybe. It works, so that's a yes. It meets
10:46
our requirements. So if your requirements are similar, if you want PDF/A, no additional tools, nothing to pay for like Antenna House or similar rather cost-intensive solutions, that's fine. But of course it's another
11:07
tool in the tool chain. Someone needs to master it. You need to have someone who is familiar with this kind of thing. I wrote my PhD thesis in LaTeX, so I'm actually coming from that world. So I had to adapt
11:24
to the XML and get over that, but if you have someone who is used to processing XML files, it may be a different story. Yeah, these are the two drawbacks, I think. You need to have someone who really, really knows this kind of thing and is familiar with it, and you have another
11:43
tool which might break. I'm not saying it will break, but it's another dependency, so to say. Yeah, that's it from me.