How to migrate from PostgreSQL to HDF5 and live happily ever after
Formal Metadata
Title: How to migrate from PostgreSQL to HDF5 and live happily ever after
Part Number: 159
Number of Parts: 169
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifier: 10.5446/21188 (DOI)
EuroPython 2016, 159 / 169
Transcript: English (auto-generated)
00:00
You may have heard it said that you should never rewrite a project, because normally it's a disaster, but this was an exception, so I want to tell you how it went and what I learned from that, and I also want to give you some advice
00:21
about the political issues I faced, because I started this project as a newbie and essentially I had to rewrite everything, and that is not necessarily easy, even from a political point of view, okay? I will talk about that.
00:41
So, this work was done at the GEM Foundation. Probably you don't know about the GEM Foundation: it is a non-profit foundation and we do research about earthquakes. So essentially we have the guys
01:00
who make those nice colored maps that you see. These kinds of maps. Here it doesn't look too good because of the projector, but anyway, this is the map of Europe.
01:21
This was a project called the SHARE project. It was done at ETH in Zurich, and essentially it was the first project where the OpenQuake engine, which is the name of the product we make, was used to produce this nice colored map,
01:43
and when I came to GEM nearly four years ago, in 2012, this was already done. So we already had a successful software project which was able to produce this, which is the result of a very big computation,
02:02
because there are 150,000 points and you compute essentially the hazard curves: the probability of having an earthquake within a given time period. And so I did not start the project.
02:22
I mean, I arrived and I had to learn everything, and I had to change everything too. Now I want to give a bit of a presentation about myself. I'm actually a physicist, because I come from academia too,
02:43
and actually I did my fair bit of general relativity in the past; I also worked on theory, the real stuff. But then I switched careers. I found Python. I liked it.
03:00
Maybe you have read some of my papers. For instance, if you have studied multiple inheritance in Python, I wrote an article about the C3 method resolution order, and a series of papers on metaclasses with David Mertz. This was, I don't know, 12 years ago,
03:21
when I started with Python. At the time I was still a physicist and Python was my hobby, so I had the time to write papers; then I started working, so my contributions to Python became a little less. I'm also the author of the decorator module, which you are probably using even if you don't know it,
03:41
because it is a dependency of SciPy. I think it is also a dependency of IPython, and of several web frameworks, Pylons for instance. There was even a framework I discovered, I don't remember which, that was doing decorators in the same way I was doing them.
04:02
So I looked at the source code and saw they had the same idea. And then in their source code I found my code, with a comment saying "taken from Michele Simionato's decorator module". So probably you are using this code; it has hundreds of thousands of downloads because of the frameworks using it. So I'm very happy with that module.
04:21
Anyway, as I said, I started working as a Python developer, at the beginning with Zope and Plone, which is an experience I don't recommend. Then I worked for seven years in finance, doing risk calculations, financial risk.
04:42
And now I'm doing risk calculations for earthquake risk. Surprisingly enough, the terminology is very similar. We have the assets: before, the assets were the options; now the assets are buildings. And we have to compute the damages when something disastrous happens,
05:00
and it's actually very similar. So we had an engine, a conventional engine, and now we have an engine for earthquakes; not so different in the end. And I arrived at GEM, as written there, in October, and after a while I became the man in charge of the OpenQuake engine, which is the simulation engine,
05:21
the computational engine that, given the models, the seismic models that the scientists give us, produces these maps.
05:40
I'm giving this presentation because, remember, this morning Gaël Varoquaux gave this beautiful talk about the separation between the two worlds: the scientists on one side, and the web programmers, and programmers generally, on the other side. And I made the transition between these two worlds
06:04
because I came from physics, then spent 10 years doing web and database development, this kind of stuff. So I understand very well what he was talking about. And it is true, there is a separation,
06:22
and this separation between the worlds causes problems. Because essentially, in our case, we have this scientific application, but the scientists didn't want to write this amount of code, especially because it has to run on a cluster, with concurrency, data storage, all kinds of problems; so they hired some programmers.
06:46
And they left the architecture, the architectural decisions, to the programmers. And the programmers decided to do everything with the database, which was the source of all the problems we had, performance problems. Everything was impossible, essentially.
07:02
And what we did in these three or four years was to drop the database, essentially remove the database, as we will see. And yes, I know the scientific world very well because I come from it, but I also know the business enterprise world very well.
07:21
And generally I'm an old-school guy, so I don't like SQL that much, but I like the relational model. I like relational databases, okay? I spent several years working with SQL and the query analyzer in Postgres, doing big stuff on databases, and I like that stuff.
07:41
I'm very old style, I still use indexes. And generally I don't throw away old code just because it is old; this was my punchline. But sometimes you have to do that. So I'm going to talk about what happened
08:01
and why I was forced to do essentially a total rewrite of the code, okay? Something I had never done, because I had always worked with legacy code, and typically what you do is rewrite small portions: you refactor, you clean up, you do small things. You don't throw away everything, okay?
08:20
Because that would be foolish. But I was forced to do that, and I will tell you why and what I learned. So this was my first reaction when, exactly four years ago, I got a job interview with this foundation and I was discussing with the guys there.
08:43
And they told me, you know, we use Postgres to store our floating point numbers, essentially big arrays, and I was very surprised. I said, well, I never heard of anybody doing serious numerical simulations storing everything in Postgres.
09:02
Well, I thought, maybe the database is just for storage: maybe they do the computation in memory and use the database only to store results; maybe they have a good reason to do that. On the contrary, there was no reason at all that they used Postgres,
09:21
except that they were programmers, web programmers, so they knew Django, so everything was done as a web application. And there was really, really ugly stuff, like a Django adapter to convert NumPy arrays into PostgreSQL objects,
09:42
which were then stored in the database. Really ugly stuff. And since this was a kind of distributed calculation, we had these big arrays coming from the worker nodes: you have a central node, the master node,
10:01
where the database is, and you send this data to it. Actually, at the beginning they were not sending the data: the workers were reading from the database, doing the computation, sending back big arrays; then all these arrays arrived at the same time and were loaded into the database, so that you could read an array from the database,
10:24
combine it with the array coming from the worker, do some multiplications, things like that, and store it again. And everything was locked, of course, because you have, I don't know, 500 workers, and by construction they arrive more or less at the same time, so you are waiting.
10:42
And then you have 100 workers writing to the same table. It was really a nightmare, and the worst thing was that there was no way to decouple the database logic from the scientific logic, because everything was in terms of Django models, okay?
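To make the coupling concrete, here is a minimal, hypothetical sketch of the kind of adapter described: scientific arrays squeezed into database rows by serializing them into Postgres ARRAY literals. The function names are illustrative, not the real engine code.

```python
# Hypothetical sketch: arrays serialized to Postgres ARRAY literals,
# one conversion on every save and every load, for every worker.

def to_pg_array(values):
    """Serialize a list of floats into a Postgres ARRAY literal."""
    return '{' + ','.join(repr(v) for v in values) + '}'

def from_pg_array(text):
    """Parse a Postgres ARRAY literal back into a list of floats."""
    return [float(x) for x in text.strip('{}').split(',')]

gmf = [0.12, 0.34, 0.56]          # a (tiny) ground motion field
literal = to_pg_array(gmf)        # '{0.12,0.34,0.56}'
assert from_pg_array(literal) == gmf
```

Every save and load pays this serialization cost row by row, with hundreds of workers contending for the same tables, which is part of why the database approach could not keep up with the array sizes involved.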
11:04
Then what do you do in this situation? Rewrite everything? So when I arrived I said the only way would be to remove everything. But what do you do? The code base actually was good, in the sense that there were lots of tests, lots of docstrings,
11:23
more or less well explained, et cetera, but the architecture was totally bad, in the sense that it was impossible to get performance out of it. And also, of course, I knew that if you need to rewrite something, it takes a lot more time than writing from scratch.
11:42
Because if you start from scratch, it's easy: somebody gives you the specifications, you know what you want to do, you decide the technologies you want to use, you don't have many constraints, you start, you write. But when you have an existing code base, you must do reverse engineering.
12:02
You must read what is there, try to understand why it is there; then you ask questions, you ask them why this and that, and they tell you: ah, I don't know, this was written by programmer X, who is not around anymore because he changed jobs. Or: this was required by this scientist,
12:21
but now the scientist is in Peru, things like that. So it's a lot more complicated to rewrite than to write, and I didn't want to do that. Why do I say there was no choice? Because after a while I did some experiments, I measured some things,
12:42
and I discovered pretty early, actually in the first months I was there, that we had problems like the one you can see there. Essentially, in one of our calculators we were producing ground motion fields. A ground motion field is the shaking of the earth, okay?
13:01
There is an earthquake, and you compute, say, the peak ground acceleration, for instance. You can compute these ground motion fields at all points; you have a grid of points. You compute the ground motion fields, and the time to compute them could be this, but the time to save
13:21
the ground motion fields could be a lot bigger; actually it was a lot bigger. So here the computation takes, I don't know, 4,000 seconds, and this is 40,000 seconds. This is an example I ran on my workstation; actually the situation on the cluster was much worse than that. So you have a situation where the computation takes 10 times less than saving the data in the database.
13:43
And this was not the worst of it, because then you have to read this data back from the database, and the real problem was reading the data from the database, because we ran out of memory. The first time we tried this computation on the cluster, we ran out of memory. The cluster was 128 gigabytes times four machines,
14:07
so half a terabyte of memory. It was not enough. And I later estimated that it would have taken more than two terabytes of memory to run that. So it was totally... no way, okay.
14:22
So there were two options: either you don't save the ground motion fields, or you save them in a more efficient way. And you can see here... you may think there is an error here, because as you see, there is only one column; the second column is missing. Actually it is not an error.
14:42
So this is the engine release. Okay, we have a number for each release of our software, the engine. This was release 1.5, and it was using the database, so it took, I don't know, 40,000 cumulative seconds. Cumulative means summing the times over all the workers.
15:01
It took 40,000 seconds; now, with release 2.0, it takes less than half a second, okay. So the second column actually is there, but you cannot see it because it is half a second. So there are five orders of magnitude of difference, depending on whether you use the database, Postgres, or HDF5, okay.
15:26
So, five orders of magnitude. And actually I must say that this is engine 1.5, but when I arrived it was before engine 1.0, and the situation was much worse, because one of the things I did was, for instance, to improve
15:42
the Postgres queries: I did a bulk insert instead of many single inserts, I tried things like removing the indices, I tried all the stuff that you can do with a database. So this is already very much optimized, okay.
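A minimal sketch of the bulk-insert optimization mentioned above: instead of issuing many single INSERTs through the ORM, build one text buffer and load it with a single Postgres COPY (psycopg2's `cursor.copy_from`). The table and column names here are made up for illustration; only the buffer-building part is executed here, since it needs no live database.

```python
# Build a tab-separated buffer that a single COPY can load at once,
# replacing thousands of individual INSERT statements.
import io

def make_copy_buffer(rows):
    """Build a tab-separated text buffer suitable for cursor.copy_from."""
    buf = io.StringIO()
    for site_id, gmv in rows:
        buf.write('%d\t%s\n' % (site_id, gmv))
    buf.seek(0)
    return buf

rows = [(1, 0.12), (2, 0.34), (3, 0.56)]
buf = make_copy_buffer(rows)
# With a live connection one would then do (hypothetical table name):
#   cur.copy_from(buf, 'gmf_table', columns=('site_id', 'gmv'))
```

This kind of change is what buys "one order of magnitude" on the database side; as the talk explains, it was still far from the five orders of magnitude that removing the database entirely delivered.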
16:01
But if you remove the database, you can gain five orders of magnitude. And five orders of magnitude means that one day of computation becomes one second; one year of computation becomes five minutes. It means you can do computations that otherwise would be impossible. And I measured this kind of speedup,
16:20
not only for the ground motion fields but also in another part of the code, and when I measured this speedup, I thought: no, I made a mistake, it's impossible. But then I measured again, and there was no mistake. This was a case where there were thousands of queries, because Django was using the object-relational mapper,
16:42
so instead of doing one big query it did lots of them. Anyway, these kinds of measurements are real. Actually, there could be even bigger cases, but I couldn't measure them, because you get tired after waiting two or three days, and you stop. But the larger the computation,
17:01
the larger the cluster, the worse the situation was. Also, if your database is mostly empty, okay, there is some performance. If the database is nearly full (we had added one terabyte to this space), then it performs much worse, of course, because you have to insert into a database which is already full. And we also had memory problems, et cetera, et cetera.
17:23
So I didn't expect these kinds of incredible results at the beginning, because I knew about HDF5 (actually, the scientists had told me about it), and I did some experiments and saw that HDF5 was at least 100 times faster,
17:41
but I would not have expected 100,000 times. The reason is that a small experiment on a small DB is one thing, but in the large case the database behaves much worse, so the speedup is much bigger. So I wanted to remove this stuff, but sometimes you cannot, because I went to my boss
18:04
and said, look, there are these problems. He told me: look, I believe you, I am totally convinced that you are right, but this architecture, this application, has just been written, okay? We spent one year on it. I cannot tell the scientists: oh, now we have to rewrite again, okay?
18:23
Because we have to release in six months. And actually there's another interesting thing: a younger colleague of mine came a few months before me, did an analysis of the code, and told the same thing to the boss in an email; but this email, by mistake,
18:42
went into the hands of the developers who wrote this architecture, and of course they were not happy at all about that mail, so there was friction. And also the teams were split, because we had half of the team in Zurich and half of the team in Pavia.
19:01
I was in Pavia. And essentially the team in Zurich did the hazard part. The hazard part means the probability of having an earthquake; the risk part means the economic damage. And we were in charge of the risk part. But it is very difficult to do an efficient
19:22
calculation of the risk if the hazard comes to you in a format which is not a direct one, okay? Because if you have to query tables, they are not structured well for the risk; maybe they are structured well for the hazard, but not for the risk. And the teams don't talk with each other, or there are issues, et cetera, that I don't want to go into, but you can imagine.
19:43
So, what did I do? I did nothing. For eight months. Nothing except study the code base, learn what was there, maintain it, fix the bugs, and let the frustration grow, which I think is a positive thing,
20:01
because the more you are frustrated, the more you want to change things. At a certain moment, if it is limited in time, you are motivated and things can get done. So a bit of frustration is also good in the end. We had the tests, we improved the sections
20:20
which were not the real problem. For instance, the first thing I did there was to put in place a monitoring system: while the system was running, I could measure how much time it was taking to run queries, to do stuff, okay? Even that was not so easy, okay?
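The monitoring idea can be sketched with a simple context manager that accumulates how long each named operation takes. This is only the core idea under illustrative names; the real engine's monitor is presumably richer (per-worker reporting, memory tracking, etc.).

```python
# A minimal timing monitor: wrap any block of work in `with monitor(name)`
# and accumulate its wall-clock time under that name.
import time
from contextlib import contextmanager

timings = {}  # operation name -> cumulative seconds

@contextmanager
def monitor(name):
    t0 = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - t0

with monitor('saving gmfs'):
    time.sleep(0.01)  # stand-in for the real work (e.g. a DB insert)
```

Measuring first is what made the later claims defensible: the "saving takes 10x the computing" and "five orders of magnitude" numbers come from this kind of instrumentation, not from intuition.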
20:41
Maybe after the talk, if you want, ask me and I can tell you some stories. And I rewrote the XML parsing, because the seismic models are in XML format and it was done badly. The concurrency I unified, using my map-reduce everywhere, because it was done in a strange way. So anyway, a lot was done in these eight months,
21:03
but not the real problem. Fortunately, the Zurich team evaporated; and I don't mean this in a bad way, nobody was fired. But people left. It was made clear that the company wanted to move
21:24
most of the development to Pavia, and not Zurich. So somebody left and found a better job. Somebody else was on a contract with a time limit, and at the end of the limit we said: okay, if you want, you can come
21:40
and work with us, but in Pavia. And the guy said: I have a girlfriend here. So, you know, the team evaporated, and we took charge of both hazard and risk. And we started removing stuff. More than 10,000 lines of code, it was; a lot of stuff which was really not used at all.
22:01
Okay, we removed it, and we decided what to do. We, essentially me and my colleague Luigi, the one who wrote the email I mentioned before, decided: okay, let's keep Postgres, because we cannot throw everything away. And we changed a lot in the database.
22:25
So we implemented the migration mechanism, which was missing: alter table, change the structure, change the queries. A lot of work, more than one year, spent fighting with Postgres. And we got something like one order of magnitude
22:42
of improvement; so we had a lot of improvement, but not five orders of magnitude, okay. But anyway, in the meanwhile we had the releases: release 1.0 went out as scheduled, and we had users. And, by the way, I'm not saying
23:00
that everything was wrong and broken, because some calculators, like the one that produced the map of Europe I showed, didn't have big performance issues. So that one worked; it was the other ones that had big problems. Anyway, we could keep this software, this code, working.
23:21
Finally, in the end (this was in September, nearly two years ago), I made more measurements and realized that I could have improved a bit more on the Postgres side, but it was not worth the effort. I could say: okay, I can improve,
23:41
maybe by a factor of two, three, five, but I needed a factor of 10,000 to make this computation possible. By that time I knew that HDF5 was much better, because we don't use transactions, okay. The architecture was wrong, because we didn't need a database at all.
24:02
We didn't need to do any queries. We had some geospatial queries, which were extremely slow, and we didn't need them. So the database was totally useless. So I decided: let's try to remove it. And then there was the Windows porting, because, of course, we wanted that,
24:22
because we are doing open-source software, the software called the OpenQuake engine. Open: it means everything is on GitHub. The code, the reviews, everything is public, and we want this software to run on any platform, so that any scientist in the world, any seismologist,
24:42
can download it and run it on their laptop, not only on a cluster. And if you want to run on Windows, on a laptop, maybe you don't want to force a scientist to install Postgres and fight with configuration files, things like that. So we didn't want Postgres for that reason either,
25:00
so I said: let's try to write a light version, okay? We keep the monster as it is, but we write a small version that doesn't depend on the database. This was an excuse because, for me, it was clear from the beginning that at some moment the toy model, the light version,
25:22
would have replaced the monster completely. That was my plan from the beginning. So, anyway, I did that. It was a lot of effort, because I needed to duplicate, essentially: old calculators, new calculators,
25:40
make sure that they gave the same numbers, run the same tests. Fortunately, we have a lot of functional tests. And I learned stuff, which is fine. I made a lot of changes; here are some of them. Now, I plan to leave time for questions, so I will just let you read these slides,
26:02
and then if you have questions on these points, please ask me in five minutes when I finish. So, there were a lot of changes, and the end of the story is that there is no Postgres anymore now. There is an SQLite database. The SQLite database is used
26:22
to store metadata information. For instance, I have a calculation: there is a job ID, start time, stop time, description. Then there is a table with the logs of that calculation, a table with the outputs of the calculation, and a table with the performance of the calculation. And that's it.
26:41
All the scientific data, all the arrays, are inside HDF5. If you don't know what HDF5 is, please study it, because it's really impressive software, extremely easy to use and extremely performant. Okay, these are some of the things I did.
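The split described here can be sketched roughly like this. The table layout, file names, and dataset name below are illustrative guesses for this sketch, not the engine's actual schema; it assumes NumPy and h5py are installed.

```python
import sqlite3
import numpy as np
import h5py

# Metadata (job id, timestamps, description) lives in a small SQLite file.
conn = sqlite3.connect("calc.sqlite")
conn.execute("""CREATE TABLE IF NOT EXISTS job (
    id INTEGER PRIMARY KEY,
    description TEXT,
    start_time TEXT,
    stop_time TEXT)""")
conn.execute(
    "INSERT INTO job (description, start_time) VALUES (?, datetime('now'))",
    ("demo hazard calculation",))
conn.commit()

# The scientific arrays go in an HDF5 file, stored compactly as float32.
curves = np.random.random((1000, 20)).astype(np.float32)
with h5py.File("calc_1.hdf5", "w") as f:
    f.create_dataset("hazard_curves", data=curves, compression="gzip")

# Reading the arrays back is a single indexed lookup, no SQL involved.
with h5py.File("calc_1.hdf5", "r") as f:
    print(f["hazard_curves"].shape)  # (1000, 20)
```

The point of the design is that only small, relational facts (jobs, logs, outputs, timings) need a database at all; the bulk numeric data bypasses it entirely.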
27:02
Again, ask me afterwards if you're interested in this kind of stuff. There was a lot, a lot of work: a port to Python 3, a port to Windows, serialization to HDF5, a lot of work on XML, because at the beginning we had an XML schema,
27:21
which was another idea that turned out to be completely wrong. Anyway, a lot of stuff. And I am surprised, because in the end it seems to be working, and I'm still surprised that, with such an enormous change, it went so well.
27:40
Anyway, it went well in the sense that, wow, we actually have performance that is thousands of times better than before. We have more tests than before. The scientists are happy: they can run it on their laptop; it works on Windows, works on macOS,
28:01
works on Linux, works everywhere. We even tried it on a Raspberry Pi, and it works. So I want to tell you something about what I learned that may be interesting for you. This I already knew, but you see,
28:22
after this work, I believe in these things even more. It is really essential: monitoring the system is the first thing you should do. Also, we had a problem with the unit tests, because with a database, in the tests you had to create fake tables,
28:40
populate rows, delete rows, do all that kind of stuff, and the tests took, I don't know, two hours to run. Now they run in seconds, because with HDF5 it's really easy. Also, I removed most of the original unit tests,
29:00
because they were testing implementation details, and if I changed the implementation, I was forced to change the tests. I tried to replace them with functional tests. Functional tests are what the scientist really wants, because the scientist tells me: look, with this model, you should get this result. You give them a CSV file, some numbers, and they compare the numbers.
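A functional test of that kind can be sketched in a few lines: compare the numbers in an expected CSV file (supplied by the scientist) against the actual output within a tolerance. The file names and helper names here are made up for this sketch, not the engine's actual test harness.

```python
import csv
import math

def read_csv_numbers(path):
    """Read a CSV file of numbers into a flat list of floats."""
    with open(path) as f:
        return [float(cell) for row in csv.reader(f) for cell in row]

def assert_numbers_close(expected_path, actual_path, rtol=1e-5):
    """Functional-style check: the two files must contain the same
    numbers within a relative tolerance, regardless of how the
    implementation produced them."""
    expected = read_csv_numbers(expected_path)
    actual = read_csv_numbers(actual_path)
    assert len(expected) == len(actual), "different number of values"
    for e, a in zip(expected, actual):
        assert math.isclose(e, a, rel_tol=rtol), (e, a)

# Demo with two small files standing in for reference and computed output:
with open("expected.csv", "w") as f:
    f.write("0.12345,0.67890\n1.11111,2.22222\n")
with open("actual.csv", "w") as f:
    f.write("0.123451,0.678902\n1.111112,2.222221\n")

assert_numbers_close("expected.csv", "actual.csv")  # passes within tolerance
```

Because such a test only looks at the final numbers, you can rewrite the implementation freely without touching it, which is exactly the property the talk is after.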
29:22
That's what the scientists want. They don't want to test implementation details. Okay, I learned this kind of stuff. Also, I am old enough now, I have some white hair, so I feel that I can give some technical advice
29:40
to people, and my first piece of technical advice is: do things simply. Don't care too much about performance; do things simply. When they are simple, 95% of the time they are already faster without doing anything, just by removing cruft, okay?
30:02
And also, don't spend time complicating your technology stack, because I got requests from people telling me: why don't you try Numba? Why don't you try a graphics processing unit? Why don't you try compiling with the Intel compiler? Which are all good things, all good things.
30:21
But you know, my problem was the database. First I had to remove the database; then I could think about those things. I cannot complicate the stack before I simplify it. And also, always challenge the assumptions. For instance, we had these very complex geospatial queries, so we were using PostGIS, and in the end
30:41
you say: but do we really need these geospatial queries? And the answer was no, we didn't. So I removed the geospatial queries entirely, and everything was better than before. So challenge assumptions: just because some code is there doesn't mean it should be there, okay? And so my advice is: take the most difficult problem
31:03
that you can solve. This one is in italics on the slide, okay? It means: if you cannot solve the problem, wait a bit; have a plan B, a plan C, D, E, F, because of course the big problem is not easy. But keep in mind what the big problem is,
31:22
and everything that you do should be aimed at solving the big problem. In this case the big problem was removing Postgres, okay? I couldn't remove Postgres for eight months, for political reasons, but I kept thinking about it. And be patient. Also some political advice:
31:42
yeah, listen to the boss. You know, when you are a newbie and you come in saying, I would like to change everything, the boss will correctly, I think, tell you: wait a bit. He's right, so it's not a problem. Take the slow way.
32:00
And every time you make a small change, make sure that the scientists, the users, get a measurable advantage, so you can tell them: I made this change, and now this is 50% faster, and the user will be happy. Then the next time you propose a change, they will believe you, because they know from experience that what you did before was okay,
32:21
so they will believe you. It takes time to build trust, because when you arrive somewhere new, even if you are, like me, over 40 and a well-experienced programmer, you are a newbie. So you need to take your time. That doesn't mean you must always say yes; you shouldn't be a yes-man, okay?
32:42
You can raise your voice. Sometimes there are technical issues that you can discuss and reach an agreement on. Sometimes you don't reach an agreement, because, you see, they are simply not technical. For instance, one thing: I wanted a single repository, not three repositories. For me, it was a big issue;
33:01
for the boss, no. We had had a single repository, it was split in three, and then we wanted to merge them back together, you understand. In the end, we have a compromise: from four repositories, we now have only two. So not a single one, but, see, sometimes you can fight a bit. And I discovered several things:
33:22
essentially, I did some experiments retrieving floating-point numbers from the database. A 32-bit floating-point number is four bytes. Going through psycopg2, it took something like, I don't know, 30 bytes, because, you know, the Python float
33:42
is already 24 bytes. Add the layer of psycopg2, more bytes. Add Django, more bytes. So four bytes can become 50 bytes; it can be an order of magnitude. This was surprising, that the effect was so big. I also saw from experience that sometimes
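The overhead of boxed Python floats versus raw float32 storage is easy to check. This is a minimal sketch using only the standard library and NumPy; the exact object sizes assume a 64-bit CPython, and no database driver is involved, so it illustrates only the first layer of the overhead described above.

```python
import sys
import numpy as np

# A single Python float object: 24 bytes on a 64-bit CPython,
# versus 4 bytes for the raw IEEE-754 float32 value.
print(sys.getsizeof(1.5))

# One million values as a list of Python floats: ~24 bytes per object
# plus ~8 bytes per pointer held by the list itself, before any
# driver or ORM layer adds more on top.
values = [float(i) for i in range(1_000_000)]
pointer_array_bytes = sys.getsizeof(values)  # the list's pointer array alone

# The same million values as a float32 NumPy array: 4 bytes each,
# which is also how they are laid out inside an HDF5 dataset.
arr = np.arange(1_000_000, dtype=np.float32)
print(arr.nbytes)  # 4000000
```

Moving the arrays out of the database and into float32 datasets removes all of the boxing and driver layers at once, which is where the order-of-magnitude saving comes from.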
34:02
you make a design because you don't want to run out of memory, and you do that. It doesn't run out of memory, but maybe it takes one week. Your computation runs for one week, you get no feedback, and maybe at the end you finally run out of memory anyway; or you don't run out of memory, but it never ends, and the scientist comes and says:
34:23
but why did this stop? So it is actually best to let it fail, okay? It runs out of memory after half an hour, and the scientists know: oh, what happened? It ran out of memory after half an hour; maybe I put in a wrong parameter. So after half an hour they already know that they used the wrong parameter.
34:41
That was the reason. So sometimes it's actually better to use more memory: maybe it fails, but it fails early, and then you can take action and decide what to change. Also, it is more efficient if you try to do everything in memory.
35:02
Yeah, there is another story that I don't have time to tell. Oh, yes, let's say two minutes, okay? Two minutes, and I will finish. What I did for the concurrency: I decided I liked concurrent.futures, so I changed everything, because we had
35:21
reinvented concurrency in this strange way; we had at least three ways of doing concurrency. So I removed them all and said, let's use concurrent.futures. I used the interface of concurrent.futures, wrote a map-reduce on top of that, and made the system pluggable,
35:41
so if I am on a laptop, I use concurrent.futures, essentially multiprocessing. If I am on a cluster, we use Celery. Everything was tested: the Python 3 part was tested on Travis; for Python 2, we were using Jenkins.
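A map-reduce layered on the concurrent.futures interface, with the executor pluggable, can be sketched like this. The function and parameter names are made up for this sketch (the engine's real API will differ); a thread pool is used here for brevity, where the talk describes a process pool locally and a Celery-backed executor on a cluster behind the same interface.

```python
from concurrent.futures import ThreadPoolExecutor
import operator

def map_reduce(task, chunks, agg, acc, executor_cls=ThreadPoolExecutor):
    """Run `task` over `chunks` in parallel and fold the results
    into `acc` with `agg`. Swapping `executor_cls` is the pluggable
    part: ThreadPoolExecutor or ProcessPoolExecutor on a laptop, or
    a Celery-backed executor on a cluster."""
    with executor_cls() as executor:
        for result in executor.map(task, chunks):
            acc = agg(acc, result)
    return acc

def subtotal(chunk):
    # The per-chunk work; here just a sum, standing in for a
    # real hazard calculation over a block of sites.
    return sum(chunk)

chunks = [range(i, i + 100) for i in range(0, 1000, 100)]
print(map_reduce(subtotal, chunks, operator.add, 0))  # 499500
```

Because all three previous concurrency mechanisms collapse into this one interface, the calculators never need to know which backend is running underneath.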
36:00
And it's good. Wheels are really, really great, because we have this problem that you saw at the keynote this morning. With wheels, especially the manylinux wheels, which are very recent, only about six months old, you can distribute your code on essentially any platform without problems.
36:20
h5py has some bugs, some strange behaviors; it can give you a segmentation fault. So I recommend it, but pay attention. I have some regrets, too, because essentially, for two years, we spent time trying to optimize Postgres, which was a battle against windmills;
36:45
there was no way to win that battle. I should have removed more stuff, especially the tests, because, you know, I am old, so I'm conservative. If I see some tests, I think these tests are important, so if they break, I fix them. If they break again, I fix them. If they break again and again, I fix them three times.
37:02
After four times, I see that these tests are actually in my way, that I don't need them, and that it's better to throw them away and replace them with something else. So after four times I do that; if I had done it the first time, I would have saved a lot of time. Okay. Also, I spent some time porting features
37:25
that, after discussing with the scientists, I discovered were not even intended features. The scientists said: ah, the engine is doing that? I didn't know about it. And these were well-tested features, et cetera, so I expected them to be very important, and they were not,
37:41
so these are regrets. Still, I'm very proud of this slide, very, very proud, because our code base is split, as I said, in two. One part is hazardlib, which is, let's say, the low-level library. It's still in Python, but it's low-level; it's the part that the scientists work on,
38:02
and unfortunately you can't see it well here, but there are numbers: this shows the releases of the engine against the number of lines of code, going from something like 30,000 to 70,000 lines of code.
38:22
So in three years, because this spans three years, from release 1.0 to release 2.0, we more than doubled the number of lines of code, and that's fine, because the scientists added more models, more seismic models. But if you look at the size of the engine, which is my part, let's say,
38:40
I reduced the size from 55,000 lines to, I don't know, 45,000. So after three years of working there, there is now less code than before, and I'm very proud of that. You know, in the intermediate releases there was duplication, lots of ugly stuff, so there was more code,
39:01
but finally I removed all of that, and now it's gone: 95% of the old engine is gone, and the code has been rewritten. So let's say that I'm happy, even if it took nearly two years instead of the one year I had thought. Thank you.