Author Generated JATS XML Markup

Speech transcript
Good afternoon, and thanks for having us here. Is the sound all right all the way back there? Good. Well, what I want to do in the first few minutes is bore you with the story of how this all began. I'm a physician, and as a physician I tend to look at medical things. There was a day, in the very first days of the Internet, when I thought that I wanted to educate my students, and so this is part of the story of how we came about, but it
started even five years earlier than that, in 1990. Some of my family members and I got together, and, you know, this was the heyday of CompuServe. Remember those startling speeds of a few kilobits a second? It took us, I would say, about half an hour to download one image, in that case of a car that we tried to sell. So we had a perfect Internet company, an e-commerce company, going in 1991. In fact we never sold an entire car; what was sold was car parts, and somehow it turned out to be like that because, remember, at that time not everyone had a computer on their desk. When we went to car dealers, they didn't have desktops; we had to sell them the computer, give them a lecture on how to use the computer, how to turn it on and off, and how to go into the system and use it. There was no Internet unless you worked at one of the bigger academic institutions. So really, we closed the company in '94; we ran out of money, like start-ups do. Talk about timing: in '95 the Internet came along, and with the fully staffed office we had in '94, making money and all that, we could have sold the company to venture capital in Silicon Valley
for probably hundreds of millions, and I would not be here; I would be talking to you remotely from some beach somewhere. But anyway, I'm a physician, so I was busy doing my medical stuff, which is one of the reasons why we were so naive about the real world. Then came the day when I saw, from running this car-parts company, that I could use those images to tell my students, my residents, what to do. So I set up a page on the server of Baylor College of Medicine with one article that I put up, and I said, "By the end of the week I want you all to look at that; we will have a journal club at the end of the week." And nobody looked at it, which was frustrating. These were really the first days of the Internet, with the first browsers out there. But then the guy who ran the web server at Baylor came to me, completely excited, and said, "You know, the whole College of Medicine got like 20 hits last week, and you got like 200 with that one article." So I said, "There's a model there; let's go away from resident education and put it out there to the world." We basically launched our first journal at that time, and it was called the Internet Journal of Anesthesiology, because I'm
an anesthesiologist. So, you know, from there on we grew: we added another journal, we formed the company, and then we just kept adding stuff. But we never required anyone to jump through hoops to register. We were open access from day one, and it is still like that today: there's no registration, no subscription fee, things like that. Interestingly, some people would send me a letter from wherever they were, and at night I would sit down and retype their articles so we could put them online. It was kind of interesting. Then I would ask them to submit as a Word document,
or learn how to use the Mosaic browser. You remember all that; you're all the geeks here, right? I used the Netscape browser too; remember what that looked like, and things like that. Then Microsoft FrontPage came along, and, man, that was really cool compared to what I was doing before. Then I was hooking up with some people at the University of Athabasca who knew a little bit more about these kinds of things, and this is when I met this, at the time, 16-year-old kid, right? So somehow we hooked up when he was still a little smaller, and he was
already thinking in SGML and XML; I didn't even know what that was. Ever since, we have worked together, and he's going to tell you all this XML stuff that I have no clue about. I just tell him, "Please do it," and it's done in an instant. That's how it works with these things. So currently we have expanded to
82 journals over, you know, ten years or something like that, or even more; now we're like a 15- or 16-year-old publishing house, all medical journals, 82 titles. About three years ago we created an article submission system, which is a separate website, and we are just about to launch (we are beta testing it right now) the next version of this article submission system. And we're going to touch on things that you would stay away from; it's like poison to you when I say "author-generated". We're going to let our authors do the work, and we're going to see what crap we're going to get and how we can handle it. So the next JATS-Con presentation is probably going to be about the
crap we got. I don't know. We really decided that, I think, it's going to be cost-efficient for us if we can automate even the mistakes that authors make, to a certain extent, and fix them in an automated fashion, and he will tell you how we're going to do that. But that's really it; I want to give it over to him, and, you know, he deserves all
the kudos for what we're doing, because he's really the brain behind it. With that, it's all yours.
Thank you. OK, so, well, our JATS editor: what is that? I guess it represents a move for us from doing all the markup ourselves using Word macros to having a web app where authors can do all of the markup themselves using a what-you-see-is-what-you-get editor. For a little bit of back story: we based it on what we know, which is PHP, so we built it in PHP and JavaScript, and the particular framework was chosen
since it's the most popular one, so there are a lot of developers out there; if you choose to use it, it can probably be extended very easily. It's not at all proprietary. I guess the main thing is that it's easy to use. It's form-based: for the majority of JATS it's just input fields that you copy and paste into. It's a very linear workflow, from beginning to
end: just a bunch of menu items, and you copy and paste. OK. And yeah, we have
the different steps. We have the header, like the front matter, which right now is just a web header form where the title, the author information and all the metadata go. Then there is the body markup, which we do in Microsoft Word, with some Word macros built ten years ago to apply tags to Word styles. The last step is just converting that to XML, to JATS. So unfortunately we're still slower; we're going into this new workflow now, but we still do it this way, using macros from 2001. So this is kind of how it looks
when you have your document I'm sure you have tedious sisters that men and have that the
the seems to affect the the that's what thank you your words should be out of the room
when using
is going out so poor people learn all this so what we want to raise that was always 1 of my models something went the of about 6 months something like that you don't mind set on the ground in 1 month so at this point we get an article on 1 so it's like the within an article what I do once it gets accepted and pay words than on the which 1 of our business models this is the only way that the problem is of it. When we get an article, the way that works is: once it's accepted and paid for, I take it and I just make sure, as a physician, of the medical content, that, you know, the tables are OK. Sometimes I find pictures that look like they don't belong to that article, and find that it's plagiarized and so on, things like that. So I try to weed out a lot of that stuff, and then we send it off and somebody puts it in. We have to enter the title and the abstract, and for all the authors we have to pick, one by one, the surname, given names and honorifics, their address, the country, all that stuff. It just takes a long time. It's OK if you do one or two at a time, but if you're doing it for hours, you start not seeing straight anymore and start cutting and pasting the first name as the last name, so errors are introduced. So this just shows you how boring it is; it's quite boring, it's
time-consuming. Comparatively, I think we're fairly efficient this way, like you could do an article in half an hour, something like that, but with all of us being part-timers, that's sometimes still too much time. And if we have part-timers just doing data entry, then there are mistakes. There's no validation; we only find out about a problem when the author comes back six months down the road and says, "My name has been wrong for six months." So it takes a long time to publish, sometimes three or four weeks, and then if there are corrections to be made it takes, again, a couple of weeks. So we decided
to switch now to this author-generated JATS markup because, well, it addresses all of those problems. We obviously can't support the whole spec, because it's huge and there's just simply too much, and too much, shall we say, contradiction; if we let the authors do whatever they want, we wouldn't be able to display it properly. So we support a subset. We looked at our current article corpus and at what kind of markup we're using right now, for example in the article title tag, and asked: do we need to have everything that JATS supports in the editor? And the answer is no. So we just looked at how much of the markup we can offload to the author and still get something of quality from them, without giving too much choice. We support NLM 3.0. We have the metadata support, and then we have, I guess, inline and block-level elements that the author can mark up as they see fit; we reserve that for the abstract, the body and the appendices. But we limit each input to having either inline or block content; the title would always be inline level. So we have a clearly defined set of tags that we allow in those fields. Inline and block is just whatever is easy enough for us to support, because the editor is based on HTML, because it's just too cost-prohibitive to have an XML editor on the front end for authors to use.
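The per-field tag restriction described here could be sketched roughly as follows. This is only an illustration in Python, not the speakers' PHP/JavaScript code; the field names, the tag lists in `ALLOWED_TAGS` and the helper `disallowed_tags` are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical whitelist: which HTML tags each input field accepts.
# The real editor's lists are not given in the talk.
ALLOWED_TAGS = {
    "title": {"i", "b", "sub", "sup"},            # inline level only
    "abstract": {"p", "i", "b", "sub", "sup"},    # simple blocks plus inline
    "body": {"h1", "h2", "h3", "p", "ul", "ol", "li", "pre",
             "table", "tr", "th", "td", "img", "i", "b", "sub", "sup"},
}


class TagCollector(HTMLParser):
    """Records the name of every start tag seen in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag names.
        self.tags.append(tag)


def disallowed_tags(field, fragment):
    """Return the sorted tag names in `fragment` not allowed for `field`."""
    collector = TagCollector()
    collector.feed(fragment)
    collector.close()
    allowed = ALLOWED_TAGS[field]
    return sorted({tag for tag in collector.tags if tag not in allowed})


if __name__ == "__main__":
    # A block-level <p> inside a title field is reported as disallowed.
    print(disallowed_tags("title", "Growth of <i>E. coli</i> <p>oops</p>"))
```

A check like this would run before any conversion to XML, so that a block element pasted into an inline-only field is caught at input time rather than at schema validation.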
It's kind of confusing; I mean, I don't really like using XML editors either, because there's so much going on. So it's almost the standard presentation layer of rich text that we can support. It's based on
HTML, so we can't really do nested structures in blocks, like sections, but that's OK because we can still use XSLT 2 grouping to infer that structure from whatever we have. The editor is like CKEditor or TinyMCE, something like that; that's how it works. So we can support boxed text, figures, graphics, preformatted text and tables, so pretty much anything, but we limit it just to keep it simple for the author. This is an example: a lot of the metadata in JATS can be collected simply by using a text area with some markup options, and that's essentially how it would look. For contributors we're pretty flexible: they can be single authors or collaborative groups, like some of the, I guess, government groups that collaborate on papers. So most of the JATS contributor model is supported. And then for author notes and author bios, inline- and block-level formatting is supported. Keywords we just collect; right now we aren't trying to enforce any constraint on them, such as being valid MeSH entries, so they can be anything. If you want to apply constraints to keywords, you can supply a list from whatever source you want. As for other metadata in the article meta, we have article IDs, author notes, supplemental content, grants and funding information, article history and permissions. I was thinking about maybe extending the article history section a little bit so that you can include some extra
info in there, so the editor itself can keep track of changes that are made in the article, because it takes in NLM XML and spits it back out; if there was a track record included in the XML, it would be good for accounting for changes to articles if you have no other means of storing that information. What we support in the abstract, the
body and appendices is kind of a moving target, because there's so much that can be supported. We don't do anything with MathML right now, or equation support, because maybe 1 percent of our articles in fact come in with any equations, and if that happens we'll just mark it up by hand or capture an image of it if need be. But right now, what we support in those sections is probably ninety-some percent of what we need.
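The talk describes inferring nested sections from flat, heading-based HTML with XSLT 2 grouping. The following is a minimal Python sketch of the same idea, not the speakers' stylesheet: it swaps in a simple stack algorithm for XSLT grouping and assumes the editor output has already been reduced to a flat list of `(tag, text)` pairs. The function name `group_sections` and that input shape are hypothetical; `body`, `sec`, `title` and `p` are the JATS element names.

```python
import xml.etree.ElementTree as ET


def group_sections(blocks):
    """Turn flat (tag, text) blocks into nested JATS-style <sec> elements.

    Headings h1..h3 open a section at that depth; any other block becomes
    a <p> inside the innermost open section (or <body> if none is open).
    """
    body = ET.Element("body")
    stack = [(0, body)]  # (heading level, element); level 0 is <body>
    for tag, text in blocks:
        if tag in ("h1", "h2", "h3"):
            level = int(tag[1])
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            sec = ET.SubElement(stack[-1][1], "sec")
            ET.SubElement(sec, "title").text = text
            stack.append((level, sec))
        else:
            ET.SubElement(stack[-1][1], "p").text = text
    return body


if __name__ == "__main__":
    flat = [
        ("h1", "Discussion"), ("p", "Main point."),
        ("h2", "Limitations"), ("p", "Small sample."),
        ("h1", "Conclusion"), ("p", "Done."),
    ]
    print(ET.tostring(group_sections(flat), encoding="unicode"))
```

In XSLT 2 the equivalent would typically be written with `xsl:for-each-group` and `group-starting-with` on the heading elements; the stack version is used here only because it is short and self-contained.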
So, the editor is HTML-based. How we convert to JATS from the HTML is just by using XSLT 2 grouping, specifically for the nested sections, so that h1 tags and h2 tags in the HTML end up being mapped properly to sections in JATS, and you can have a discussion with subsections and have that end up being valid XML afterward. Other than that, we use a lot of regular expressions all over the place to transform the data between HTML and JATS. Essentially, the editor pulls in a file, runs a data transform, and then displays it; if anything has changed, it transforms it back to XML. So there always has to be some sort of mapping possible, whether it's by using class attributes or ID attributes; there has to be some kind of mapping. If there's not, then a different method has to be devised to collect it. Right now, image figures are
handled via out-of-band file uploads on a separate page, so that we can collect large images. Otherwise, dragging and dropping, using the editor itself to handle image uploads, is something we don't fully support, because we want to collect the highest-quality images that we can, sort of 30 or 50 megabytes uncompressed, but we don't want that to affect the user experience by slowing down the response time while they are editing the document. And right now we accept tables as an image, but we're working on making the tool better so that it can deal with tables that you just cut and
paste from Word into the editor and have it capture them properly, without any messy formatting; we still want to look at that as well. Videos and other media types we don't handle yet; they're just generally too large and too rare, so it's not a big feature to support at this point. As for citation handling
in the body text, which is the place where we've implemented this annotation tool (it can be added to the title as well): in the text you simply highlight the reference, and it will resolve it from the back matter and ask you if you would like to link to this particular endnote. So it's really quick; it's not difficult. If the author has, like, a reference number 10 at the end of the sentence and highlights it, it will say, "Do you mean endnote 10 in the back matter?", and it links it automatically. In the back matter, we support
all the top-level back matter elements, like acknowledgements, and with the notes, the content type can be set, so that if there is any arbitrary note that you need to include while still retaining the meaning of the note, then you can fit it into
the back matter. I guess one of the hardest parts of getting the author to submit their own markup as XML is the references, which are tokenized; we can't expect an author to sit there and copy and paste each of the 150 endnotes, and also each author name, into some form. So what we do is have a giant text area where the author pastes the entire citation section, and it's exploded by line, and we search each line for some kind of identifier from a metadata service. We look for PMC IDs and DOIs right now, and if there is a match in the metadata, then we just disregard what the author gave us altogether and pull it from that authoritative source, and then we know it's correct and has fewer problems. But if there is no identifier, then we use the same annotation tool, and basically all you have to do is highlight some text, and it asks you to define what the text is and stores that as a JSON object. It works really simply and doesn't take much time at all. And before they can submit, all the endnote problems have to be solved, if there are any.
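The first step of that reference handling, splitting the pasted list by line and looking for an identifier, might look something like this in Python. It is a sketch under assumptions: the regular expressions, the function name `triage_references` and the result layout are invented for illustration, and the lookup against a metadata service is deliberately left out.

```python
import re

# Assumed patterns; the talk only says that PMC IDs and DOIs are searched for.
DOI_RE = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)")
PMCID_RE = re.compile(r"\b(PMC\d+)\b", re.IGNORECASE)


def triage_references(pasted):
    """Split a pasted reference list by line and tag each entry.

    Entries with a DOI or PMC ID could be resolved against a metadata
    service; entries with neither are left for manual annotation.
    """
    results = []
    for line in pasted.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        doi = DOI_RE.search(line)
        pmcid = PMCID_RE.search(line)
        if doi:
            # Trailing sentence punctuation is not part of the DOI.
            entry = {"raw": line, "id_type": "doi",
                     "id": doi.group(1).rstrip(".,;")}
        elif pmcid:
            entry = {"raw": line, "id_type": "pmcid",
                     "id": pmcid.group(1).upper()}
        else:
            entry = {"raw": line, "id_type": None, "id": None}
        results.append(entry)
    return results


if __name__ == "__main__":
    sample = (
        "1. Smith F. A title. J Test. 2010. doi:10.1234/abc.5678.\n"
        "\n"
        "2. Doe J. Another title. PMC1234567\n"
        "3. Roe R. No identifier here.\n"
    )
    for ref in triage_references(sample):
        print(ref["id_type"], ref["id"])
```

Only the entries that come back with `id_type` of `None` would need the highlight-and-label step described in the talk.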
This is kind of a simple example: if you were to highlight the name here, you get, you know, a window here where you can choose what type of data it is. In this example we pulled it from CrossRef, but if that was not there, then it wouldn't take that long to tokenize the string. And this is
the mapping again. We can really support anything: as long as we define a unique way to infer it from the HTML, so that it matches in the transformer's map that we defined, it's not a problem to collect that information.
As for when things go wrong: obviously the first thing is that we validate against the schema. If there's a problem at that point, then it would require staff intervention to go and find out why the XML is not valid. Since we're collecting from the authors, there's probably a chance that they may put entire paragraphs in as headings or section titles, in which case we can't really check for that very well. So that's when staff would come in and take a quick look before it gets passed off to peer review, and then of course copy editing, where we can just go back and fix any problems in the paper. For some sorts of problems, like endnote problems, if we decide it's too much work for the author to go and fix any endnote problems, we were thinking of Amazon Mechanical Turk, which is a platform for creating human intelligence tasks that you can submit, and they'll have some human do a task for you, and it costs five cents or whatever the person charges to do the task. So if you have a badly formatted endnote, you create one of these tasks and send it out, a couple of people would do it, and if the
result of everyone's work is the same, then you can probably consider that the work was done properly. And I think the good thing about it is that it's 24/7, so people are working 24 hours a day on whatever task. So if you're given an article and you want to get it published within a week or something like that, you don't want to wait for the author to come back in six weeks or anything like that; we can take it and have it put into the pipeline. So, to
summarize: really, what we're doing is touching the hot potato, right? We're allowing authors to start putting data into our system.
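The agreement check mentioned for outsourced tasks, accepting a result when several workers independently return the same answer, can be sketched as below. This is a Python illustration only; the function name `consensus`, the whitespace and case normalization and the `min_agree` threshold are assumptions, not details given in the talk, and no Mechanical Turk API is called.

```python
from collections import Counter


def consensus(answers, min_agree=2):
    """Return the answer given by at least `min_agree` workers, else None.

    Answers are compared after collapsing whitespace and lowercasing, so
    trivial formatting differences do not count as disagreement.
    """
    normalized = [" ".join(answer.split()).lower() for answer in answers]
    if not normalized:
        return None
    value, count = Counter(normalized).most_common(1)[0]
    return value if count >= min_agree else None


if __name__ == "__main__":
    # Two of three workers agree once spacing and case are ignored.
    print(consensus(["Smith  J", "smith j", "Smyth J"]))
    # No agreement: this item would go back to staff or out to more workers.
    print(consensus(["Smith J", "Smyth J"]))
```

Anything that returns `None` would fall back to the staff intervention described earlier.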
And time will tell us how it's going to be. Who has done that here in the room? It's a hot potato, right? You wouldn't even touch it, probably most of you, because, you know, we did that early on with some other things, and really, as we said before, there's a whole variety in what authors consider to be good. For us, it's really about how we can outsource things cost-effectively, with the minimum amount of staff intervention, to really capture stuff before it gets published. When we look at things right now, I'm sure everyone who is involved in publishing and using these, whatever, XML and DTDs and all that stuff, however you call it, has figured out that there is always a problem with the endnotes, there's always a problem with images, and there is always a problem with the tables. I think the images are pretty much solved, I guess, for most people. The tables are still a mess, so we need to figure out how to work on that and find nice ways to capture tables. What we're doing right now is publishing them as an image, because authors submit crappy tables: either it's a table format made with Excel or something, or they just tab-tab-tab them out, and then they change the font and all the tabs are off, right? Something like that. So we have to figure out for tables whether we still need to go in there and do some hand work, or whether we can leave it up to them and just say, "Hey, these are our requirements, and you can resubmit until we accept it," so that we wouldn't have to play with it in the system. And again, the last really big problem was the endnotes, the references, and, you know, he showed me this little tool, an automated script he wrote, and it really works very nicely. So again, we're going to collect some information on that, but clearly we're outsourcing, basically, on the front end to the author,
because when we look at our corrections after we've published, 95 percent of those corrections are due to names: the first name and the last name, where we cannot identify which is which. You know, being here, if someone is named Frank Smith, we can know that Frank is probably the first name and Smith the last name. But when you get all the names from around the world, you have difficulties. And the way we use the name when we publish it, we have the first letter of the first name and then the full last name, and so sometimes it's just reversed, and then we have the full first name and the first letter of the last name, if we don't know which one is which. So we hope that authors will know their names. It still doesn't tell us that they are going to enter the first name in the field for first name and the last name in the field for last name, but if they don't, it's their mistake, and, you know, they cannot come back to us and say, "You made a mistake," because we took it exactly as they entered it. So really about 90 to 95 percent of our corrections have to do with names, or maybe with affiliation. And we have those kinds of academics who every two months get fired or change places, I don't know why, and then there would be a request to us, so we keep on changing their affiliation. And, you know, the model where we tell them, "OK, you have to pay for that, don't ask us anymore," that's cool; that solves itself. So really, on the front end, we outsource to the author, then we do our work. And here is a mistake; this is an old slide set, we just did another one an hour ago. The thing is, we do the peer review before the copy editing, OK? So once an article is peer-reviewed, accepted and paid for, then it goes to copy editing. We have probably about ten medical students, nursing students and MDs that do this kind of copy editing for us, so they are all people from the medical field. And then it goes out, and then on the back end,
for those, you know, still-open problems, we're going to try this Amazon Mechanical Turk, where we're going to say, "OK, we still have a few problems." If we solve them in-house, it costs us a dollar or two dollars per problem, so why not put it out there, and whoever, wherever in the world, thinks that two, five or ten cents is a good salary to do that can be our guest and do it. And if we get two results back that are more or less the same, we think it's probably valid, and we would use that. I'm sure we'll publish some mistakes; I'm sure there are going to be errors in there. It's going to be a little trial-and-error thing until we can finalize this. It's a touchy hot potato, but we're going to work with it, because we think it's very cost-effective and it's going to be fast: it saves us probably three or four weeks in the publication process, and probably the longest step will still be the proofreading. So with that said, we thought this is probably going to open a few questions, because I think you all have stayed away, with reason, from author-generated XML. So with that, we invite any questions you might have, and with all the technical stuff, I'll just push it right away to him. Yeah, OK, the licensing: what are we going to do with this? You know, I'm half medical brain, half business brain; he gives everything away. So we're going to try to figure out what the truth is in between the two of us. We might create, how do you call this license, I forget the name; I didn't know what it was, but you give away something, it's open source, and people come in, they improve it and give it back to you. So maybe we'll do a thing like that. If it really works out, let's say we'll play around for sure for about six months with it, so that we have enough; we'll have hundreds of articles flow through, and then
somehow we'll have to go back and make an analysis of that and see, really, did we publish garbage or not. I hope not. And if it really proves to be a valid tool, then we might consider some licensing. Licensing could be that we license the software to somebody who wants to use it on their own; the headache with that is that we would have to follow up with service, and then, you know, new versions and stuff like that. Or we create a platform: we buy another separate server bank, and we invite people to come and use our platform and pay for the platform, not necessarily for the software, where we just maintain the software all the time on our platform. You could pay, possibly, anywhere between, maybe, one dollar and ten dollars per article, depending on whether you just want to use the tool to generate the XML and take the XML back to your own server and your own publishing platform, or whether you want to use us as a host as well. Because we're getting requests all the time that we should add another journal, or a newsletter for academic institutions, or whatever it is, and I have rejected them all over the last 10 or 15 years, because we have enough to do to maintain our own growth and to do our own in-house stuff. But we're at the point where almost everything is almost automated, so at this point we can start looking at maybe doing stuff for other people as well. So these are kind of the models, and it can be any mixture of those. I don't know; we will first play around, because when you start taking money from other people, you want to make sure you deliver, and that you deliver good stuff; that's kind of my philosophy. So I have pushed away some even very good deals just because I thought we could probably not deliver on time. Once we're ready for that, it means maybe we're going to hire a few more staff people, full time, and then we will deliver something really meaningful. But I'm not
sure if we want to go there. I mean, right now we could go either way on all of those. All right, any other questions? Any questions? I've got two points. David Shotton, from Oxford. The first one is just to stress that if you do want cost-effective further development of this wonderful product that you are developing, then going the open-source route is a very, very cost-effective way to build a community around you, because I'm sure there are many people who would like to contribute to this. I think you're right on, and that really plays into what he's preaching to me, and I have an open ear for what he's saying. What we could do is develop the tool in an open-source form, and if any one of you wants to collaborate in some way, shape or form, you know, this is his address, the e-mail address; please contact him. And what we may
do then is take that open-source tool and put it on our platform, and people could just pay us for platform use if they want to, right? That's an easy way. It would be a very cheap way for somebody who wants to put out a little journal or, you know, newsletter and doesn't want to
go through the whole process of arranging hosting services and all that. I think it's a one-stop solution, and it could be a very low-cost one-stop solution. The other point I want to make is about tables as images. Last week I was at the European Bioinformatics Institute at a workshop on literature and data, which was looking at how journal publishers treat data. PLoS, the Public Library of Science, is one of the leading journal publishers in the biomedical field, and they do a very curious thing: they give each of the figures and tables its own DOI and publish it, so it's citable, but then you can download it for free only as an image. In other words, the data are not actionable; you can't do anything with them except retype them into a spreadsheet. This is absolutely the worst thing in the world you can do, and I'm very pleased to say that PLoS announced at this meeting that they were changing their policy, and now they will be publishing table data as actionable numbers, not as images. So I would encourage you not to go down the route of putting tables into images and essentially forcing the readers to do what Peter Murray-Rust described as turning hamburgers back into cows. Well, that's the right term, but we have 13 years of articles now where the tables are images. So this is maybe a task that would be suitable for the Mechanical Turk: you give somebody an image of a table and they transform it. Your authors have them as spreadsheets already, so you could ask them to submit the numbers, not images. That's a trade-off, right? I mean, we can make them do more when they're submitting in the system, so that's the trade-off. I totally agree with that. But about one half, I would say, of our tables do not come in as a spreadsheet; they're created in Word with tab, tab, tab; it's a mess, it's an absolute mess. You're laughing, so you really know what I'm talking about. Right now, we are now part of CrossRef; all our
articles have DOIs, and we are in the process of getting DOIs for every image, and then we're going to categorize them, so it's going to be easy to access images. We have images, really, you know; we accept case reports that normal journals don't. Say it's a case report of a tumor coming out of the ear; a normal journal would say, "We have seen this like 20 times; we're not going to publish it." But we see stuff like, in Africa, somebody who didn't have a chance for the last 15 years to see a doctor, and now this tumor has grown so much that it looks like an elephant, you know, kind of thing. We have pictures that are really absolutely mind-blowing for this century, this time of the century. So I think there is a lot of value in these images if we somehow categorize them and make them available in some API format; I don't know how. We still have a big question mark about the tables because, again, one of the biggest crappy submission types is tables, and, you know, the endnotes; those are really the two problems. The other problem we have, which we have not touched on at all, is plagiarism. We're going to start using CrossCheck, probably within the month. And for the changes we talked about, so we can make an annotation of when something was changed, we're going to start using CrossMark. There is no reason for us to reinvent the wheel when there are perfect tools out there that are being offered. So for us it's just how to incorporate these tools to make the flow seamless, so that it doesn't take us too much effort; again, we need to automate all these things. But plagiarism, I mean, this is a huge problem. We were talking at lunch about this: people submit articles and don't even make the effort to change the font and the color from where they copied it, right? So we get articles with like three or four different fonts, and we know right away it was
copied and pasted from different sources: Wikipedia, the free sources.

I have a couple of questions. First, I'm Jeff, from [unintelligible]. Is this in production now?

[unintelligible]

And my other question: you mentioned copy-editing. Does that happen before or after the conversion to XML?

After.

OK, good. And you had one of your slides where you had HTML and JATS XML with arrows going both ways. That's pretty exciting. Does it go both ways?

Yeah. [unintelligible]

So if you started with a JATS XML article that conformed to your subset, it would bring it back into your system for editing?

Yeah. It's like a black box: you throw the XML in, it will show you the output in the editor, and then when you click Save you get the XML back.

OK, great.

And one way to look at it: if I'm an author and I'm doing all this work now that I'm required to... We still have one of the lowest author fees. We ask about 220 bucks, and then probably 60% of our authors get a huge discount because they are from the third world. So compared to others it is very low, but it still might be a lot of money for people from the third world. Still, I would just tell them: hey, look, this is how it's going to look. They can look at it before it's even published, the way it's going to look when it's published. We think that's critical, because they can start playing around with it, maybe change a few things if they're not happy, and we don't have to do that at the back end.

Great. Any more questions?

[unintelligible], independent consultant. I had a question about that black box: how are you handling special characters? You mentioned TinyMCE and CKEditor, which I believe both use the traditional named character entity references.

We don't use
either of those, actually. But it converts special characters into XML entities.

So are you using code-point character references or named character entity references?

We're using the actual code point.

OK, great.

[unintelligible] Forgive me if I missed this, but I'm trying to imagine the interface. The user fills out some of these metadata fields, but then for the body of the article, can they go into the Word document they already wrote, select all, copy, and paste? Will the editor handle this?

[unintelligible] For the most part it handles it properly, but if there are tables in there it might not be the same. That's why I said that right now we still accept the tables as images. But ideally it should be a cut and paste of text.

Yeah, I wanted to convince him that we need a form for everything. You know, if it's a case report, it goes: the abstract, the keywords, the introduction, the case report, the conclusion, and the references. And if it's an original article, it has the methods and materials and then the statistics and all that. But he convinced me: just one block, put it in there. And when you look at the submissions, again, I think we'll have problems with maybe two or three out of a hundred, maybe two or three percent, where we'll have to hand-correct certain things. But again, it's just a question: would you have some staff working on 100 percent of your articles, or do we just let it run through, look at it at the end, and then correct one or two percent? You know, maybe in a year or two we'll be back here and will tell you it worked, or we'll tell you it simply failed, I don't know. But that's a nice
thing when you're small: we're fast, we can turn left and right. If it fails tomorrow night, I can say, OK, we go back to this one, and then [unintelligible].
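
The special-character answer in the exchange above (store the actual code point rather than a named entity) can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the speakers' code: the function name `named_to_numeric` and the choice of hexadecimal references are hypothetical, and only the standard-library `html.entities.name2codepoint` table is used.

```python
import re
from html.entities import name2codepoint

# The entities predefined by XML itself are valid in any XML document,
# so this sketch leaves them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named reference such as &eacute; or &nbsp;
_NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text: str) -> str:
    """Replace HTML named entity references with hexadecimal code-point references."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Keep XML built-ins and unknown names exactly as written.
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])

    return _NAMED_REF.sub(replace, text)


print(named_to_numeric("caf&eacute; &amp; 5&nbsp;&micro;m"))
```

Named references such as `&eacute;` are only defined by an HTML DTD, so an XML parser without that DTD rejects them; numeric references need no declaration, which is presumably why a pipeline that emits JATS XML would prefer them.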

Metadata

Formal Metadata

Title: Author Generated JATS XML Markup
Series title: JATS-Con 2012
Part: 10
Number of parts: 16
Authors: Gajetzki, Andy; Wenker, Oliver
License: CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they have specified.
DOI: 10.5446/30576
Publisher: River Valley TV
Release year: 2016
Language: English
Production year: 2012
Production location: Washington, D.C.

Content Metadata

Subject area: Computer Science
Abstract: At Internet Scientific Publications, we have since day one marked up submitted manuscripts using a Microsoft Word macro developed in-house. After 14 years, we feel that this approach is not ideal for two reasons: 1) most errors that exist in the finished XML are introduced during the data-entry / markup stage, and 2) markup represents a significant time expense for our staff that could be better spent elsewhere. Since we only charge at the point an article is accepted for publication, there is a time investment in marking up manuscripts that may never be monetarily recouped. Consequently, we have explored the option of allowing authors to mark up their own documents from our submission frontend website. There are drawbacks to this approach, namely the complexity and completeness of JATS and the huge learning curve a non-technical author would encounter, but we have in turn concluded that a majority of the JATS definition does not need to be made available to an author in our frontend application. If an article requires more specific markup that we do not support in the application, we can always fall back to publisher-side markup using our tried and tested Word macro. Quality control occurs later in the pipeline during copy-editing, regardless of which markup pathway is followed.
To facilitate this, we have created a self-contained Symfony2 bundle that supports manuscript markup utilizing a subset of the JATS Journal Publishing 3.0 tag suite. Much of the front and back matter is captured using simple form inputs and is validated using regular expressions developed from common input patterns. For the body, an HTML5 DOM-based WYSIWYG editor is used. Although the generated markup is HTML5, by using a subset of JATS, we can unambiguously map between the two markup languages.
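
The claim that a restricted subset allows an unambiguous mapping between HTML5 and JATS can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Symfony2 bundle described above: the tag table `HTML_TO_JATS`, the function `rename_tags`, and the five-element subset are hypothetical, and the HTML fragment is assumed to be serialized as well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-to-one tag table for a tiny subset (an assumption for
# illustration, not the authors' actual mapping).
HTML_TO_JATS = {"p": "p", "strong": "bold", "em": "italic", "sub": "sub", "sup": "sup"}

# Because the table is one-to-one, inverting it gives the reverse direction.
JATS_TO_HTML = {jats: html for html, jats in HTML_TO_JATS.items()}


def rename_tags(fragment: str, tag_map: dict) -> str:
    """Rename every element in a well-formed fragment according to tag_map."""
    root = ET.fromstring(fragment)
    for element in root.iter():
        if element.tag not in tag_map:
            # Anything outside the subset is rejected instead of guessed at.
            raise ValueError(f"element <{element.tag}> is outside the supported subset")
        element.tag = tag_map[element.tag]
    return ET.tostring(root, encoding="unicode")


html = "<p>H<sub>2</sub>O is <strong>not</strong> <em>in vitro</em>.</p>"
jats = rename_tags(html, HTML_TO_JATS)
print(jats)                              # JATS-style inline markup
print(rename_tags(jats, JATS_TO_HTML))   # back to the original HTML
```

Rejecting unknown elements is the design choice that keeps the mapping reversible: once the editor can only emit tags with exactly one JATS counterpart, converting in either direction loses nothing.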
We speculate that Amazon Mechanical Turk could be used to simplify certain article markup tasks like, for example, endnotes, where it would be off-putting for the author to tokenize the citation string. While the distribution model of a final product has not been determined, it will most likely be made available in a dual-licensed manner depending on the commerciality of the customer.

Related Material

[Full text] Author Generated JATS XML Markup
