LibreOffice Calc dependency & performance work

Video in TIB AV-Portal: LibreOffice Calc dependency & performance work

Formal Metadata

LibreOffice Calc dependency & performance work
How we made things faster & better
Alternative Title
Open Document Editors - Calc Dependency Performance
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Production Year

Content Metadata

[unintelligible] So I am going to talk about Calc, dependency tracking and all these things. [unintelligible]
I submitted two talk abstracts and one was accepted,
so you get two for the price of one. [unintelligible] Calc has been systematically refactored over the last one or two years, and we have made some huge progress. It used to look like this. Someone has colored it in a pretty way, but let me assure you it was a disaster. We used to have this column, and every cell in your spreadsheet was scattered somewhere randomly in memory, miles away, so walking along this way or that way you would be jumping all around memory, with cache problems everywhere: big, big problems. And lots of stuff was shoved into the cell itself: a bit of text width, cell type, script type. The cell was subclassed into all sorts of different things, including an [unintelligible] cell which still had dependencies [unintelligible]. Anyway, a while ago we refactored this to make it very nice and beautiful and clean, and actually split much of the core of Calc out [unintelligible], and we now have nice long linear chunks. So if you have doubles, numbers all the way down a column in Excel, or rather when we load the files from Excel, it will turn this into this beautiful representation. Say you had a million doubles in a row: you get an 8-megabyte block full of doubles. This is very nice from a computation perspective. If you are adding these up it is very simple: you are just walking linearly through memory as you move down this row [unintelligible], so we can go a whole lot faster, which is great. Anyway, this is nothing new; that was here a year ago.
The next thing we did was to try and share formulae. Previously each cell had its own copy of a formula. Often in spreadsheets you will fill a column down, so you have a hundred thousand or whatever formulae, and they are all basically the same formula, just copied and pasted down, doing the same operation on different data. And we would [unintelligible] the formula tokens twice, so each of these again was allocated all over memory. That still happens, but now we only do it once, so that all of the cells that are similar are in a group, and we have a single thing that represents the formula in reverse Polish notation [unintelligible], so that is what is fed through the interpreter. That had a huge impact on memory usage. This is a sample document that has some formulae [unintelligible] savings, which is quite useful.
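As an illustration of the formula sharing described above, here is a minimal sketch in which one reverse Polish token list is compiled once and reused for every cell in a filled-down column. The names `RelRef`, `FormulaGroup` and `evaluate` are hypothetical and are not LibreOffice's actual classes; the sheet is modelled as a plain dictionary purely for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelRef:
    """Reference expressed as an offset from the formula's own cell."""
    d_row: int
    d_col: int


# One RPN token list for the whole group: "two cells left + one cell left".
SHARED_TOKENS = (RelRef(0, -2), RelRef(0, -1), "+")


@dataclass
class FormulaGroup:
    """A run of formula cells in one column that share a single token list."""
    col: int
    first_row: int
    length: int
    tokens: tuple  # compiled once, shared by every cell in the group


def evaluate(group, row, sheet):
    """Interpret the shared RPN tokens for the formula cell at (row, group.col)."""
    stack = []
    for tok in group.tokens:
        if isinstance(tok, RelRef):
            # Resolve the relative reference against this particular row.
            stack.append(sheet[(row + tok.d_row, group.col + tok.d_col)])
        elif tok == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unsupported token: {tok!r}")
    return stack.pop()


# Illustrative data: column 0 holds r, column 1 holds 10 * r.
sheet = {(r, 0): float(r) for r in range(5)}
sheet.update({(r, 1): 10.0 * r for r in range(5)})

group = FormulaGroup(col=2, first_row=0, length=5, tokens=SHARED_TOKENS)
results = [evaluate(group, r, sheet)
           for r in range(group.first_row, group.first_row + group.length)]
print(results)  # [0.0, 11.0, 22.0, 33.0, 44.0]
```

Because the references are stored as offsets, the same token list is valid for every row, so memory for the formula grows with the number of distinct formulae rather than the number of cells.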
There is still a whole lot more to do, though. [unintelligible] We had dependency problems as well: all of this worked with dependencies based, again, on the cell, so each cell would look at the other stuff it depends on. [unintelligible] Let me show you; maybe you will understand this more easily.
So, here is a spreadsheet, and we have some data here, very imaginative data [unintelligible]. And you have something simple here, like a SUM [unintelligible]. This guy at the top here depends on three cells [unintelligible], and as we fill it down, each of these guys [unintelligible]. In the middle here you see that the rows slide down, so we have this sliding rectangle of dependencies for all of these. When I change a number here [unintelligible], then we have to update these three guys [unintelligible]. The way this was represented before was extremely, extremely unpleasant.
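The sliding rectangle of dependencies described above can be sketched as follows. This is an illustrative model, not LibreOffice code: `GroupDependency`, `affected_rows` and `naive_affected_rows` are hypothetical names. The idea shown is that when a filled-down formula column references a range at a fixed offset from each formula cell, the formula rows touched by a changed cell can be computed with arithmetic instead of testing one stored range per formula cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupDependency:
    """A filled-down formula column whose referenced range slides with the row."""
    formula_col: int
    first_row: int   # first formula row in the group
    length: int      # number of formula cells in the group
    row_off: tuple   # (min, max) row offset of the referenced range
    col_off: tuple   # (min, max) column offset of the referenced range

    def affected_rows(self, row, col):
        """Formula rows whose referenced range contains the changed cell."""
        lo_col = self.formula_col + self.col_off[0]
        hi_col = self.formula_col + self.col_off[1]
        if not lo_col <= col <= hi_col:
            return range(0)
        # The formula at row f covers rows f + row_off[0] .. f + row_off[1].
        # Solve that for f, then clamp to the rows the group actually spans.
        lo = max(self.first_row, row - self.row_off[1])
        hi = min(self.first_row + self.length - 1, row - self.row_off[0])
        return range(lo, hi + 1)


def naive_affected_rows(dep, row, col):
    """Per-cell check: test every formula cell's own range, one at a time."""
    hits = []
    for f in range(dep.first_row, dep.first_row + dep.length):
        in_rows = f + dep.row_off[0] <= row <= f + dep.row_off[1]
        in_cols = (dep.formula_col + dep.col_off[0] <= col
                   <= dep.formula_col + dep.col_off[1])
        if in_rows and in_cols:
            hits.append(f)
    return hits


# 1000 formulae in column 3, each covering a 3-row by 3-column block
# that starts on the formula's own row in columns 0..2.
dep = GroupDependency(formula_col=3, first_row=0, length=1000,
                      row_off=(0, 2), col_off=(-3, -1))
print(list(dep.affected_rows(500, 1)))  # [498, 499, 500]
```

The arithmetic version stores one record for the whole column and answers each change in constant time, whereas the per-cell version needs one range per formula cell and a scan over all of them.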
Basically, for each of these guys you have a separate range, which was sort of nailed into a very complicated area broadcaster slot machine [unintelligible], with one of these for each of the cells down the column. But you looking at it, me looking at it, as human beings we go: well, it's obvious, right? There is a serious pattern going on here; you can see the pattern in the formulae [unintelligible]. So instead of storing hundreds of these areas and then trying to work out which one intersects when you change data [unintelligible], we instead store one [unintelligible], which is kind of obvious, right? Then, as we get a notification, we know which cell has changed [unintelligible], and we can look at the formula and back-project the references. A reference is a bit like a vector [unintelligible], and so you can work out which bits to recalculate. So by using a slightly cleverer algorithm you can save a huge amount of memory: lots of linked lists, lots of these connections between listeners and broadcasters, and it is just really a whole lot simpler. [unintelligible] The notification bypasses the legacy listeners [unintelligible], and there are time and space savings when copying a large formula group, if you are manipulating your spreadsheet [unintelligible]. So there are quite large wins here: we had about a threefold saving for copying and pasting large [unintelligible].
Another big problem was script type optimization. One problem in spreadsheets is working out how high the rows are. You would think that was easy, but you have to measure text, and text is a quite complicated thing. The font size unfortunately is also complicated: it depends on what kind of text it is, whether it is Latin text and so on. So you have to work out the script type [unintelligible] before you can even measure the text [unintelligible]. We were caching the result of that, but we were simply not copying it around, so when you copied data around we would recalculate all of these things, which is quite expensive. So we made that come around with the data, which helped a lot. We also saved some more: for example, if you have doubles, then you know the script type for the doubles, and you don't really need to do this very complicated thing of turning the double into a string and then [unintelligible]. It was really quite amusing, really stupid stuff. Just to recap: you remember that now, down the columns, we can tell there is a whole block of, say, a million doubles, and you can look at the format [unintelligible] general format [unintelligible] script type, and we can stick in a span, a range, and it is just the same type for the whole thing [unintelligible]. This is really useful, not just because it is good to have a quick, snappy thing, but because [unintelligible] mobile app [unintelligible], and they don't have any memory, or any CPU either, so it is a bit of a constraint to help them.
Chart optimization: every time the chart's dependency range was changed, we were reconstructing and rebuilding the whole chart: tear the whole thing down and reconstruct it [unintelligible], and it is extremely slow. And we were doing that even for charts you couldn't even see, on sheets that were not even visible. So now we delay this until it is visible, which is helpful. So that is some of the core infrastructure of Calc [unintelligible] since 4.2, which was released a year ago.
[unintelligible] the fast parser. We discovered, with what we inherited from the old ways of doing this, that whenever you see something named "fast" it is probably not very fast. So, for example, it has a very quick tokenization scheme: you take the XML [unintelligible] so we wouldn't pass expensive strings around, and this is a good idea. What is not a good idea is duplicating the string into a new sequence before you pass it to the tokenizer, [unintelligible] reading it, and then getting it back and then throwing it away again. So this completely wiped out any marginal benefit of that. So we fixed a whole load of stuff. We started to see this because we split this out into a thread, so we had a thread doing the parsing and tokenizing, so you can actually profile it and see it. [unintelligible] So that is XLSX files [unintelligible], quicker than previously [unintelligible]. The second one was [unintelligible].
[unintelligible] we started to look at [unintelligible] the fast serializer, which had been looked at before, and again "fast" is not a good sign. It turned out the fast serializer was doing individual system calls, writing things like the angle bracket, then "table", then "table-cell" and so on: just to write each simple string, each of those was going into the Linux kernel and writing straight out. And on Windows it is even slower; I mean, on Windows system calls are more expensive [unintelligible]. So this was just really, really dumb stuff. [unintelligible] There were some buffering improvements, and reducing the allocations [unintelligible], which is good; that is a saving.
The big thing, and this is not just Calc [unintelligible], is parallelizing the save. The basic construction of an ODF file or an OOXML one is: you create lots of XML streams and then you shovel them into a ZIP, which involves compressing them. So that is the profile here: lots of time is spent compressing the streams. Once we have created them all, we can then compress them in parallel, so it is a [unintelligible] thing to parallelize zipping these things [unintelligible]. That makes quite a difference if you have lots of large files to squash down [unintelligible]. It turns out another stupid thing we were doing was in Impress [unintelligible]: deflating big things. We do big things well, but it turned out we were recompressing JPEGs. Almost all JPEGs are already heavily compressed and Huffman coded, and when you deflate them they get bigger [unintelligible]. So what we were doing was spending a large chunk of our time compressing already compressed, Huffman-coded data, another inherited feature from the past, and then going "oh, it's bigger" and throwing it away again. So we added a two-line fix that says: if it is an image, don't do something stupid with it. And again that is a very nice win [unintelligible] for some presentations, which is pretty good. [unintelligible] There is still some benefit we could get here by parallelizing the actual generation of the XML, and you can do a whole lot more there, but it is much harder to do that, even though [unintelligible] sheets; quite a lot of spreadsheets have just one single big sheet of data [unintelligible].
What else? XLSX rows. This is the mad thing: Excel represents things in rows and we represent them in columns. If you look at the size limits of spreadsheets, it is like a broom handle: it is very tall and very thin, and so representing it in columns makes sense. But the other side represents it the other way, and so we have to do some unpleasant [unintelligible] trying to work out the best [unintelligible]. It was pretty slow, and we have made it much better just by switching stupid, inefficient things to slightly less inefficient ones [unintelligible]. So that is my talk. [unintelligible] The punch line: it continues to improve; lots of core refactoring and representation improvements; there is plenty more to do. If you are interested, talk to Markus [unintelligible] or come and see me. Thank you.
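The point made in the talk about not recompressing JPEGs when writing the ZIP container can be sketched with Python's standard `zipfile` module. This is only an illustration of the idea, not the LibreOffice fix itself: `choose_compression` and `write_package` are hypothetical helpers, the entry names are made up, and random bytes stand in for a real JPEG payload.

```python
import io
import os
import zipfile

# Extensions whose payload is already entropy coded; deflating them again
# costs CPU time and typically does not make them smaller.
ALREADY_COMPRESSED = (".jpg", ".jpeg")


def choose_compression(name):
    """Store already-compressed images as-is, deflate everything else."""
    if name.lower().endswith(ALREADY_COMPRESSED):
        return zipfile.ZIP_STORED
    return zipfile.ZIP_DEFLATED


def write_package(entries):
    """Write a dict of {archive name: bytes} into an in-memory ZIP file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data, compress_type=choose_compression(name))
    return buf.getvalue()


entries = {
    # Repetitive XML compresses very well.
    "content.xml": b"<cell>1</cell>" * 1000,
    # Stand-in for a JPEG: random bytes are effectively incompressible.
    "Pictures/photo.jpg": os.urandom(4096),
}
package = write_package(entries)

with zipfile.ZipFile(io.BytesIO(package)) as zf:
    for info in zf.infolist():
        print(info.filename, info.compress_type, info.file_size, info.compress_size)
```

Deciding by file extension is the simplest possible rule; a real implementation could equally inspect the media type or the stream header before choosing whether to deflate.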
2nd time and
all ring questions while I'm unit questions different the the the the following is generally given samples and the yeah the the now you can select on the that's that's quite important I'm together the
performance of functions this is some I think this is the so tell them because they quite similar on so you know you can see the lines with the
various combinations of aliasing and using it the capsule and the if yeah this is this is deliberate again since in this has this has low aligned line can hear all bottom whatever some but we want to test all combinations thereof that
I off examples the form of images on the beautiful in the in the states in this
whole area from scaling interpolating whatever logic you want to do really quickly would you use a get some sample weights play the CEO make sure work for nouns aggression anyway sorry I got distracted there we we we consolidate because you want performance when yep yeah yeah that's a real pain action because when the D falls someway this spreadsheets working so you um this even
if you have the same thing as saying that a little data on a whole line of numbers are year formula like this and in such a role in the middle uh typically it's broken it right you can't not break the on pollution is you and so these things for you so you know we we split this formula grievance to now we do that we must inevitably you pay the same formula and we be sold it is really lame but like at i xi pretty cool but anyway right who's next I did was I and I think of what we know that so that I can market the and what do we have the long George Marcus he loves patterns wasn't so your student you can do this cool new feature this summer and a paper market is called and the and the yeah but if you want
to go and then stop you know again we should we should get a move of the importance of
[unintelligible] So this is customer-driven work. This is a customer who wanted it faster [unintelligible]; we wanted to be faster than Excel [unintelligible]. There was low-hanging fruit in various areas like load time [unintelligible]. We have some great ideas, but how do you prioritize? [unintelligible] The customers put the bread on the table and keep the children looking happy, so we tend to do what they want [unintelligible]. So in this case it is customer-driven [unintelligible], and it is heavily data-driven: we profiled [unintelligible] to try to work out where we can get the maximum [unintelligible].
[unintelligible] When we did the [unintelligible] work we said we got between 30 and 500 times faster. The problem is that it is very dependent on the data. If you have a data file with [unintelligible], the load time is minuscule anyway, hard to measure and dominated by rendering [unintelligible]. Whereas if you have a million rows [unintelligible]. And again there are qualifications: on Windows we don't have 64-bit code yet; that is happening in the next release, so then you have addressable-memory concerns. So it is a very complicated answer, right? [unintelligible] We would love to see real data [unintelligible]; maybe mail it to me [unintelligible]. OK, thank you so much. [unintelligible]