Collaborating with Collabora Online
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61704 (DOI)
Transcript: English (auto-generated)
00:05
All good. I don't know if this is working, but it's green. Ah, look at you, crikey. Thank you for coming. Look at this. Excitable people. Collaboration. Well, so here we are with Collabora. I hope this is what you want to hear. One of my concerns is that I've given similar talks in the past.
00:23
So, maybe we have lots of time for questions at the end. So, I tried to do some different things this time. So, one thing that we're really eager to do is to get Collabora Online reused in lots of different places. There's lots of innovation going on out there, and lots of people have great ideas of how to use documents
00:43
and make them look better on the web. And we would love to integrate with you to do that. So, one of the things that we're really useful for is converting documents to different formats, which seems like an easy thing to do, you know? But it's really tough.
01:02
And to wrap that up nicely for you, we have this beautiful REST endpoint. And it looks so simple. You just do the curl command. Brilliant. And you ignore certificates. So, you should remove that in deployment. And out of it, you get... Well, what do you get? You get your text file turned into a docx.
01:20
That's an easy one, right? But imagine you wanted to convert a PDF or a PPTX into a PDF, or a very common request is converting PPTX into animated SVG. So, we can do that very nicely. We can produce XHTML out of it that you can run in your browser. That's actually how Collabora Online does its presentation thing. So, you can get animations and presentations.
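As a rough illustration of the conversion call described above, here is a minimal Python sketch of the same request the curl command makes. It assumes the endpoint path `/cool/convert-to/<format>` and a multipart form field named `data`, as documented in the Collabora Online SDK at the time of writing; the helper names `convert_url`, `build_multipart` and `convert` are hypothetical, and the path should be checked against your own deployment. Unlike the demo curl command, this sketch leaves certificate verification switched on.

```python
import urllib.request
import uuid
from pathlib import Path


def convert_url(base_url: str, target_format: str) -> str:
    """Build the conversion URL (path assumed from the Collabora Online SDK docs)."""
    return f"{base_url.rstrip('/')}/cool/convert-to/{target_format}"


def build_multipart(filename: str, payload: bytes, boundary: str) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data under the form field name 'data'."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def convert(base_url: str, source: Path, target_format: str, timeout: float = 60.0) -> bytes:
    """POST the file to the server and return the converted bytes."""
    body, content_type = build_multipart(source.name, source.read_bytes(), uuid.uuid4().hex)
    request = urllib.request.Request(
        convert_url(base_url, target_format),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Certificates are verified here; the demo's "ignore certificates" flag is not reproduced.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

A call such as `convert("https://localhost:9980", Path("test.txt"), "docx")` would then return the DOCX bytes, and using `"png"` as the target format is one way to get a thumbnail image.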
01:42
You can understand the structure of your arbitrarily horrible, say, binary PowerPoint file. And you can dump that into something you can parse and read and interpret and mash up and do cool stuff with. And the good news about that is that, well, so people do this already in lots of horrible ways. So, I will pick on someone.
02:02
I don't know, is there any... Who do we have in the room? Okay, let me think. What integration do I particularly like? So, there is an unnamed open source project. And when it tries to convert its files to show them in its Jitsi-like, remarkably Jitsi-like video... No, not Jitsi.
02:20
I forget. One of these video conferencing systems. BigBlueButton, let's call it. It essentially has a shell script. And all of the good, beautiful software architecture stops at the point you want to convert a document. And it launches a shell script which starts a Docker image, which then launches another shell script in the Docker image that copies a file into it with some horrible command
02:41
that then converts it via another shell script that launches an office suite that sits and talks to a, you know, RPC. And then, you know, it's just absolute horror. And if any of this hangs or dies or crashes or burns or uses more CPU or finds that one document that has a real problem, you're just doomed. You know, you have to write all of this lifecycle management nightmare.
03:03
And the good news is, with our beautiful API, you don't need to care about any of that. Deploy the Docker image, job done. If it's too big, we'll time out. If it's too horrible, we'll tell you. And it's all done for you. So, that's kind of good if you want the whole document. Often, though, people have enterprise file sync and share. They want to see their documents.
03:21
You know, they're fed up of seeing an icon. They want to see what's inside it. So, again, we can convert your document to an easy thumbnail, very trivially, nice big image. You can shrink it down to whatever size you like. And that's pretty good. So, we're really eager that people use it everywhere. And so, we've written most of the work for you already,
03:41
so you can use it. I think it's an Apache license. Really liberal, you know. I'm more of a copyleft guy, but at least I understand. Other people aren't. So, you know, in the language of choice, we probably missed out Ruby. I'm going to get in trouble with Neil later. But there you go. And we've done a whole lot of features recently.
04:01
So, one of them I was really surprised and encouraged. I was talking to someone from a European office full of lawyers earlier. And you wouldn't believe it, but they really love citations. They're all court cases, of course. I always think academic citations. But, you know, referring to other legal cases, there's this massive, world-wide web of knowledge about what's fair.
04:22
And anyway, so we've done lots of things with Zotero. So, one of the things that you can do, if you have a Collabora Online integration now, is just to push, you know, we added all this citation stuff in the toolbar here. And Nextcloud implemented this. And all you need to do is provide a box somewhere that you can set this API key.
04:41
So, Zotero have a very nice REST API. And then we plug into that, and you send us this, we add this user private info. So, you have a user info which has things like avatars and extended information about users that we send to everyone in the UI. We thought it was best not to send your private key to everyone. So, we added this extra tag, user private info.
05:02
And so, when you connect to Collabora Online and embed it in your iframe, you need three methods. A GET, so we can get the file and show it and render it. And then a POST, so we can send it back again when we've edited it. So, that's the save. And then there's a CheckFileInfo. And that basically tells us about you. So, who is this person?
05:21
You know, we've got a path thing, a URL for the document, and we've got an opaque identification password-like token. But what's their name? Tell me their name. Tell me their, you know, what they look like, their avatar and this sort of thing. And so, you just send this back there and bingo, suddenly you have beautiful integration with all of your citation libraries.
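To make the three methods and the private-info tag more concrete, here is a hedged Python sketch of what a host might return. The URL layout follows the WOPI convention (`/wopi/files/<id>` and `/wopi/files/<id>/contents`); the field names `UserExtraInfo`, `UserPrivateInfo` and `ZoteroAPIKey` are given as recalled from the Collabora Online SDK documentation and should be verified there. The function names `wopi_operation` and `check_file_info` are hypothetical.

```python
import json
from typing import Optional


def wopi_operation(method: str, path: str) -> Optional[str]:
    """Map an HTTP request onto one of the three operations a host has to serve."""
    parts = path.strip("/").split("/")
    if parts[:2] != ["wopi", "files"]:
        return None
    if len(parts) == 3 and method == "GET":
        return "CheckFileInfo"  # who is the user, what is the file
    if len(parts) == 4 and parts[3] == "contents":
        # GET fetches the document for rendering, POST saves it back.
        return {"GET": "GetFile", "POST": "PutFile"}.get(method)
    return None


def check_file_info(file_name: str, size: int, user_id: str, display_name: str,
                    avatar_url: str, zotero_api_key: Optional[str] = None) -> dict:
    """Build a CheckFileInfo response body (field names assumed, see lead-in)."""
    info = {
        "BaseFileName": file_name,
        "Size": size,
        "UserId": user_id,
        "UserFriendlyName": display_name,
        "UserCanWrite": True,
        # Extended information that is shared with other users in the UI.
        "UserExtraInfo": {"avatar": avatar_url},
    }
    if zotero_api_key:
        # Private information that is not broadcast to other users.
        info["UserPrivateInfo"] = {"ZoteroAPIKey": zotero_api_key}
    return info


if __name__ == "__main__":
    demo = check_file_info("brief.odt", 1024, "u1", "Ada",
                           "https://example.invalid/ada.png", "hypothetical-key")
    print(json.dumps(demo, indent=2))
```

Keeping the API key in a separate object from the avatar is the point of the design: the avatar is meant to be seen by everyone in the document, the key is not.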
05:42
You don't need to run a Java app on the side and then talk to it, it just works really nicely. And updates all of your citations beautifully with a familiar interface. So, that's kind of nice as a way of integrating sort of two things together through a very simple REST API into a nice UI.
06:02
Yeah, so LanguageTool is something else I love. I don't know if Daniel, he's probably a rich man. Grammarly, you know, has set a price point for, does anyone get Grammarly adverts? Has anyone watched YouTube? Has anyone been plagued by... Yeah. So, you know, there are many ways to create value in the world. You know, one of them is, of course, to create value.
06:21
The other is to tell people you've created value. And, you know, and I think often as engineers, we forget to tell people that we've created the value. You know, that's the problem. We do it all and then there's no marketing. I think Grammarly is probably the extreme example of, you know, marketing versus value. But anyway, so they can sell somehow for 30 bucks a month,
06:41
50 bucks a month, something like that, a subscription to their web grammar checker that sends all your information to someone else and sends it back, you know, with grammar checking, which is great. But because they've set the price point, there's a great company in Germany, I think based in Potsdam, that make, well, they already made an open source grammar checker.
07:01
They've done the whole create the value bit. And so, but they now sell these lovely plug-ins to people and you can, you know, for much less money, get a better open source product. And they have some of those nice AI things in there, you know. And AI is cool. Of course, for checking an ISBN is valid. Probably not the best use of AI, I might argue.
07:21
But on the other hand, sentence structure and human language, you know, that can be pretty cool. So, they're taking on Grammarly. And the nice thing about that, of course, is you can get a Docker image and you can deploy that in your Kubernetes, you know, whatever, and connect it up to Collabora Online so all of your grammar checking stays in-house. So, you get the benefit of all of that goodness.
07:42
And, you know, from a European free software company, I love it. And they're doing well and they're growing really nicely. So, nice to see that happen. Very easy to set up. And they even document the API nicely, which is kind of cool. So, you can see all of the, you know, number of features exposed in some of the JSON API they have for that URL.
08:04
Again, very simple endpoints, you know, you just send your stuff to check and you get some answers back and we show it. And then, of course, you can configure that as you like. And they have a web service. I mean, another example of just sending text and getting text back is our DeepL integration. So, another easy thing there that's often useful for people.
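For the grammar-checking endpoint mentioned above, a small Python sketch of "send your stuff to check and get some answers back" might look like this. It assumes the LanguageTool HTTP API shape (`POST /v2/check` with form fields `text` and `language`, and a JSON reply with a `matches` list carrying `offset`, `length` and `replacements`); the helper names are hypothetical, and the offsets are assumed to line up with Python string indices, which holds for ordinary text without characters outside the Basic Multilingual Plane.

```python
import json
import urllib.parse
import urllib.request


def build_check_body(text: str, language: str) -> bytes:
    """Form-encode the text and language for a /v2/check request."""
    return urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")


def check(base_url: str, text: str, language: str = "en-GB") -> dict:
    """Send text to a self-hosted LanguageTool server and return the parsed reply."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v2/check",
        data=build_check_body(text, language),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def apply_first_suggestions(text: str, result: dict) -> str:
    """Apply the first suggested replacement of every match.

    Matches are applied from the end of the text backwards so that earlier
    offsets stay valid while later parts of the string change length.
    """
    matches = sorted(result.get("matches", []), key=lambda m: m["offset"], reverse=True)
    for match in matches:
        replacements = match.get("replacements") or []
        if not replacements:
            continue
        start = match["offset"]
        end = start + match["length"]
        text = text[:start] + replacements[0]["value"] + text[end:]
    return text
```

With a hand-written sample reply such as `{"matches": [{"offset": 5, "length": 3, "replacements": [{"value": "is"}]}]}`, calling `apply_first_suggestions("This are a test.", sample)` gives `"This is a test."`. An editor would of course show the suggestions to the user instead of applying them blindly.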
08:23
And, yeah, it's a bit interesting, this. So, obviously, they want to try and retain formatting, which is probably one of the big advantages over pasting it into your web browser. But you can buy an enterprise key for DeepL and use that, but you're not going to have it on premise, I guess.
08:42
And then trying to really get styles through HTML and map them back properly is more challenging than you might imagine. And we haven't done a very good job of it. So, if anyone wants to improve that, they're very welcome to, you know, to come and contribute. But I think this idea of, you know, ask the computer to improve my document and stuff, that interaction thing is there and working nicely.
09:02
And there's lots of easy low-hanging fruit if people want to help do cool stuff. So, one of the other things we try and do is we try and integrate outside the iframe. It's interesting. So, you create virtualization, for example, and almost all of the interesting thing about virtualization is the bit that isn't virtualized, you know,
09:20
the bit where you can punch through the magic to not virtualize something and, you know, run the command inside the... Anyway. So, similarly with us, you know, the integration is probably the most interesting bit around the edge. And it's much the best if you can do that. So, we've written a huge SDK so that it's easy to do, which you can see online.
09:42
And so, when you save as, it seems easy to save as, right? But I'll just explain to you how it works. You know, you do a get and you do a post and that's kind of easy for us. But if you want to save it as something else, that's more tricky. So, but yet people kind of want that if they're editing a document. They, you know, often people load a document and they continually save it as through its lifetime.
10:01
So, the document you get actually started in 1995, you know, and it's been saved as ever since, you know, with a nice template and the WMF preview of Windows metafile in the top right corner and all of that good stuff. And often we see the macros in it. I mean, we're analyzing government macros and we routinely see like the Windows...
10:21
The Office 95 macro API had a compatibility when Office 97 arrived. And we're still seeing that in macros, you know, WordBasic dot something, just extraordinary. Anyway, save as is used, so we should do that. I talk very quickly, who has got lost? Has anyone, you know, no, I'm sorry.
10:41
Okay, so we need a file picker. So, how are we going to get that? Well, I mentioned this CheckFileInfo thing that tells us what you can cope with. And if you say, well, we can do insert remote image and we can do this write relative thing, then we'll send you a postMessage when you click save as. We send a postMessage outside the frame saying, hey, we want a graphic from somewhere, you know.
11:02
And then you can do what you like. You pop up your nice file picker, some arbitrary image creator, ASCII art, you know, whatever thing, we don't care. And when you're done, send us this Action_InsertGraphic and just a URL to it and we'll put it in the document. That's kind of cool. Or we'll save as and reload and, you know,
11:21
create a window for the new document. So, that's really useful and easy to do. And we're using, I think, the same hooks for our new export stuff. So, there's a whole lot of work in accessible PDF creation and PDF/UA. I mean, look at all these options. I mean, I hate options. But, you know, apparently you need all of these. So, we've added loads of them recently and you'll be pleased.
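The insert-graphic exchange described a little earlier can be modelled as a pair of JSON messages. The Python sketch below only illustrates the message shapes; in a real integration this logic runs as JavaScript in the host page and uses `window.postMessage`. The message IDs `UI_InsertGraphic` and `Action_InsertGraphic` and the `Values.url` field are given as recalled from the Collabora Online postMessage documentation and should be treated as assumptions, as should the function name `reply_for`.

```python
import json
from typing import Callable, Optional


def reply_for(raw_message: str, pick_image_url: Callable[[], Optional[str]]) -> Optional[str]:
    """Answer an insert-graphic request from the editor frame.

    raw_message is the JSON string received from the iframe. pick_image_url
    stands in for the host's own file picker and returns a URL, or None if
    the user cancelled.
    """
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError:
        return None  # not a message we understand
    if not isinstance(message, dict) or message.get("MessageId") != "UI_InsertGraphic":
        return None  # some other message, handled elsewhere
    url = pick_image_url()
    if url is None:
        return None  # user cancelled, so nothing is sent back
    return json.dumps({"MessageId": "Action_InsertGraphic", "Values": {"url": url}})
```

The editor does not care where the URL comes from, which is why the picker is passed in as a plain callable: a file browser, an image generator or anything else can sit behind it.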
11:42
And of course, EPUB, it's very, very useful for accessibility. It's kind of an extended HTML dialect. So, you know, you can integrate easily and get all of this richness suddenly. One of our problems, of course, is that interoperability is really, really key in what we're doing.
12:01
And people really care about that. And one of the challenges we have at the moment is our competition really is not great at interoperability. They're spoilt by interoperating with themselves a whole lot, which is easier. And so, when we save, I mean, we love ODF, right? But if you save in an ODF file... Hey, Neil, it's good to see you.
12:20
And then you sort of download it somewhere else and give it to someone on a Windows machine. Like it's not, they'll load it in Word. And it will completely mangle the document. You know, they even have a big list of the things they break. You know, I don't know if it's a thousand pages. It's a very large document that explains all the things they don't do. You know, change tracking. I mean, why would you want that? You know, what kind of features?
12:42
Anyway, so lots of it is dropped on the floor, which would be fine because obviously that product is awful. But it's sad to be blamed, you know, for someone... I mean, like, you know, the user perception is, your product doesn't interoperate. And you're like, are you sure it's me? You know, like, I don't know. So, of course, if you use the docx file format, tragically, you know, we can interoperate better
13:02
with the other world, which is a shame. But the good news is you shouldn't need to do that. You can use it all online in the browser, and you can feel happy and relaxed, knowing it's an ODF format on your server, and you have a full feature experience. But anyway, I was distracted. Remote font management. So, that's all very good. But if you've ever written slides,
13:21
what you'll notice is that you change the wording of this line here, until it just about fits in and doesn't wrap horribly, you know? And that's great, but, of course, it's highly dependent on the font being used. And if you change the font, be it ever so slightly, you know, the text can grow and then everything looks awful. And, of course, my slides look awful anyway,
13:40
just because I'm a hacker. But other people have beautiful-looking slides. And so anyway, we decided it was very useful to be able to configure fonts, and that brings a whole load of interesting problems. But anyway, to make this even easier, we have remote configuration. So, one of the things that's nice is to be able to deploy lots of images on Kubernetes
14:00
and demand-scale more and more of these things. But it's a bit of a pain to configure them. And particularly for a large hoster or something, you know, you have customers that arrive quite regularly and add things, and how are you going to deal with that? And do you restart all the... What do you do? So, we have this remote configuration endpoint now, so you can cut a whole load of your config out. And Collabora Online will just go,
14:21
hey, tell me my config, has it changed? And it'll poll it every minute or so and update a whole subset of those settings to make it easier. And one of those, of course, is the font setting. So, it's easy to have then a path to fonts. So, if you have a file sync and share thing and you can manage files, you know, just create a folder and drop loads of fonts in it, and then we'll notice. And, you know, we just get this kind of JSON
14:44
coming out of that font endpoint, this thing we configure in, that tells us where they are, and ideally some time stamps, so we don't, you know, continually fetch them. And then we can just build a whole set of fonts. And in the background, it's very funky, because we have a forkit. So, you can't fork if you have threads,
15:01
and it's kind of useful to have threads. So, we initialise LibreOffice in what we call the forkit, whose job is just to fork. And we pre-initialise everything, load our configuration, and then we fork and copy-on-write huge amounts of our static data. So, if you've started LibreOffice and thought, well, it took several seconds, what am I going to do online? Of course, that's the instant, we fork within, you know, milliseconds,
15:22
and we have a document there ready to load and open. But the problem is it really needs all the fonts, and we really don't want to hand all of those fonts to the child processes, and we really control this very carefully from a security perspective. So, anyway, after lots of work, we now restart this respawn, load files and pass them in and patch lots of infrastructure.
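The remote font list described above, with its locations and time stamps, can be sketched as follows. The JSON layout (`kind`, `server`, and a `fonts` array of `uri` and `stamp` entries) is given as recalled from the Collabora Online remote configuration documentation and should be checked there; the function names and the rule "no stamp means always fetch" are assumptions made for this illustration.

```python
import json


def build_font_config(server_name: str, fonts: dict) -> dict:
    """Build the JSON a font endpoint might serve.

    fonts maps a font URL to a stamp, which is any string that changes
    whenever the font file changes (a version, hash or modification time).
    """
    return {
        "kind": "fontconfiguration",
        "server": server_name,
        "fonts": [{"uri": uri, "stamp": stamp} for uri, stamp in sorted(fonts.items())],
    }


def fonts_to_fetch(config: dict, cached_stamps: dict) -> list:
    """Return the font URLs that need downloading on this poll.

    A font is fetched when it is new, when its stamp differs from the cached
    one, or when it carries no stamp at all (assumed policy).
    """
    stale = []
    for entry in config.get("fonts", []):
        stamp = entry.get("stamp")
        if stamp is None or cached_stamps.get(entry["uri"]) != stamp:
            stale.append(entry["uri"])
    return stale


if __name__ == "__main__":
    demo = build_font_config("demo-host", {"https://example.invalid/fonts/a.ttf": "1"})
    print(json.dumps(demo, indent=2))
```

Comparing stamps on every poll is what keeps a once-a-minute check cheap: unchanged fonts are never downloaded again, and only a changed list would need to trigger the respawn the talk mentions.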
15:42
I was actually just talking to Leonard a few minutes ago about, he was telling me, oh, you've got to mount proc, you have to mount proc, you'll get screwed over, you have to mount devices. Otherwise, something will go wrong in your stack somewhere, and we're like, yeah, well, you know, we tested that, and we fixed and we patched around those things, so that actually, you know, our jails have almost nothing in them, no proc, only two devices in /dev, no shell,
16:04
you know, they're pretty well locked down, so we like that. Oh, and then I've got a few minutes, so I'll just show you a whole load of gratuitous features we've added, just in case you like features. And the users do, it turns out. So, I'm a big hater of the blockchain, but DevDao actually sponsored some of this work, so we like that,
16:23
and the European Commission as well. So, getting our columns into our spreadsheets, lots of them has been happening, and it's got rid of this very annoying dialogue that plagued lots of users for a long time, so that's really cool. Oh, and there's even the proper credit, well, the NGI DAPSI. So, the European Union Horizon Research Programme
16:41
is actually really cool, and anyone who knows about it, if you have a good idea, the traditional way of getting funding from the European Union was that if you have a really good idea, you need to find 15 other people across 27 countries, and then try and persuade them all that their idea is the same thing, and then get someone to write the proposal and submit it, and then you don't get it,
17:02
because it's all inconsistent. So, the good thing about the NGI DAPSI, the Horizon Research thing, was that they said, hey, let's do something that's good for Europeans, and so they would just give money to single vision stuff, and our vision was, well, let's fix interoperability. So, we did a lot of that, and they paid, which is awesome.
17:20
Look at that. Yeah, you know, it just makes life so much easier, doesn't it? And probably good for Europe as well. So, anyway, so form controls, creating lots of nice rich text folders, much better PDF export with creating real editable PDFs actually built into the product rather than having to layer things over the top afterwards.
17:41
Starting to theme colours so that you can select different bits of your document and change the theme and see how that impacts the whole document, and we're doing lots of work here to extend that to Writer and Calc. Chart data tables, you wouldn't believe it, but these things at the bottom of charts are very popular in presentations. Some people like lines and other people like numbers, and now you can do both in the same thing,
18:02
it's a key interoperability thing. And then also other random interop things, you know, precisely anchoring your images and reflowing your text very nicely in the browser, improving our formula input bar, accessibility checker to try and find problems in your documents for the visually impaired, prettier dialogue functionality, so happening in the client side,
18:21
and lots of this is now JavaScript in the client side to make it more accessible and performant. We completely reworked the tile serving thing, so instead of sending new tiles when things change, we send very small deltas of them, so we find the pixels that changed and then we Zstandard compress them, so we switched from big PNGs with even headers in them and crazy stuff
18:44
to much smaller Zstandard compressed things. Thanks to Facebook, I just need to thank Facebook for helping us all get digital sovereignty back, you know, that's important, you know. Password options, so you can put passwords on your files
19:00
and various attributes, lots I'd mentioned the PDF things, embedded video playbacks if you like that sort of thing. And the last silly idea I think in the few seconds I have left, so we've done a bit of a concept for running Collabora Online in the browser, so when you go through your tunnel, and my hope is that tunnels get better connectivity, but you can then click a button and in theory run this thing offline,
19:24
so we have a prototype now of Collabora Online, if you're interested in that, there'll be talks in the LibreOffice track, which is I suppose somewhere nearby, that then allows you to edit offline, and there are a whole lot of interesting problems there, if you like wrestling with massive multi-gigabyte linkages and horrible nightmares, do get involved with that,
19:42
but there's a little prototype there that'll allow you to work on that and play with WASM, so yeah, come and hang with us, we have Hackfest, so there's a LibreOffice Hackfest, if you're excited about LibreOffice, and you should be, the cool kids are all using LibreOffice technology, come to our Hackfest this Monday and Tuesday, we have a community dinner tonight with pasta at the Business and Technology Incubator,
20:03
and there's a great link there, if you take a photo of it, you'll have it for later, so you can come along, and beyond that, we're running a Hackfest in Cambridge on March 28th and 29th, which is not only LibreOffice but also Collabora Online, and it would be lovely to see you there, if you want to come and stay in a beautiful Cambridge College and wine and dine at our expense,
20:22
and have some team building, and get stuck into your Collabora Online, we'd love to have you with us, so thank you for your patience. Are there any questions? Yes, sir?
20:44
Well, you know, like I say, yeah, when can we expect ChatGPT integration with Collabora? I'm sorry, I have to repeat the question. Yes, so this is a really good question, I mean, ultimately, you can select some text, and we can send that to you, and you can send it back quite easily. Yeah, I mean, I think AI brings a whole lot of interesting challenges,
21:04
and I think, I don't know if you've looked at Office 365 and the AI slide improver, which I obviously should have used, you know, it makes your slides look pretty, but the question is, what are pretty slides, and the real problem in AI is, of course, the training data, and one of our problems is that we like this digital sovereign world
21:23
where we don't spy on people all the time to work out what they're doing to their documents, right? So, Microsoft doesn't have this problem, they have Office 365, and they're constantly watching, you know, so they know how to make pretty slides just by watching millions of people go, oh, the colour's a bit, and also offering you options of different ways to break or improve your slide
21:41
and seeing what you choose. So, yeah, I mean, how do we build the data sets to let us do this in an open-source way? And I think AI's fantastically interesting, and Bradley no doubt will come up with the Affero-Affero AI license, you know, I'm sure this is happening, because the source is banal beyond belief in most AI, you know, things.
22:05
It's the data and the training data and the model you build. So, yeah, it would suck to be an open-source company, 100% open-source code. It's just this massive binary blob that not even we understand that you have to buy, you know, to make it useful. So, I don't know, we're working on the problem,
22:21
and there are a lot of smart minds thinking about putting AI and keeping that sovereign and on-premise, but I don't have a perfect answer. But it's a fantastic question. And if you want to do ChatGPT, we should talk, you know, come and see me. I did mention we're hiring people, I'm probably not supposed to, but we love C++ hackers, if you come and see me. We're growing fast and doing some cool things.
22:42
Other questions? Anyone at all? No? More? Yeah? Well, that's very good of you then. Come and see me afterwards if you want to chat, please do. Thank you so much.