
LumoSQL - Experiments with SQLite, LMDB and more


Formal Metadata

Title
LumoSQL - Experiments with SQLite, LMDB and more
Subtitle
SQLite is justly famous, but also has well-known limitations
Number of Parts
490
Author
Dan Shearer
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
LumoSQL is an experimental fork of SQLite, the embeddable database library found in everything from Android to iOS to Firefox. As a replacement for fopen(), SQLite is a good choice for single-writer applications and disconnected, slow and small devices. Modern IoT and application use cases are increasingly multi-writer, fast, high-capacity and internet-connected, and LumoSQL aims to address these very different modern needs. LumoSQL initially aims to improve speed and reliability by replacing the internal key-value store with LMDB, by updating and fixing a prototype from 2013, and by allowing multiple storage backends. Next up we are designing the architecture for replacing the write-ahead log system (as used by all other open and closed source databases) with a single-level store, drawing on LMDB as an example of a single-level store in production at scale. Challenges so far involve code archaeology, understanding and updating benchmarking, designing a system for keeping parity with upstream code changes, file format migration and identifying bugs in both SQLite and LMDB. Please do join us in testing and improving at https://github.com/LumoSQL/LumoSQL . In this talk we welcome questions and contributions. This conference has many SQLite users and developers. What do you want to see?

LumoSQL is a combination of two embedded data storage C language libraries: SQLite and LMDB. LumoSQL is an updated version of Howard Chu's 2013 proof of concept combining the codebases. Howard's LMDB library has become ubiquitous on the basis of performance and reliability, so the 2013 claims of it greatly increasing the performance of SQLite seem credible. D Richard Hipp's SQLite is relied on by many millions of people on a daily basis (every Android and Firefox user, to name just two of the thousands of projects that use SQLite), so an improved version of SQLite would benefit billions of people. The original code changes btree.c in SQLite 3.7.17 to use LMDB 0.9.9. It takes some work to replicate the original results because not only has much changed since, but as a proof of concept there was no project established to package the code or make it accessible. LumoSQL revives the original code and shows how it is still relevant in 2019. The premise seems sound. Some bugs have been fixed in LMDB and in the prototype SQLightning work.

There need to be multiple backends, initially the original SQLite on-disk format and LMDB, and initially for compatibility and conversion purposes. However, the ability to have more backends is very attractive, and already there are draft designs for where that could lead. The design taking shape for tracking SQLite upstream may be useful to other projects, where an automated process can handle most changes that do not touch some of the basic APIs. Write-Ahead Logs are in every single widely-used database today, a concurrency model developed in the 1990s and now the only option in both closed and open source SQL databases. There are pros and cons for WALs, but the merge-back model of WALs involves a lack of atomicity that becomes obvious as corruption and reliability issues at speed and scale. Databases go to very expensive efforts to avoid this, but combined with a lack of real-time integrity checking in almost all databases, this is a fundamental problem, especially for modern SQLite-type use cases.
Transcript: English (auto-generated)
Hi. Good evening. Our next speaker is Dan Shearer. He is from Scotland. Now he is going to introduce us to LumoSQL. It's a fork of the popular database library SQLite. Let's
welcome our speaker. Hello. Thank you. So, yes, I'm here talking about the LumoSQL project, a very new project, and it's not just me, and yes, I live in Scotland. Keith Maxwell, who is not here
tonight, he lives in Ireland, and there are one or two other contributors here in this very room. So, thank you to all those contributors. What I'm actually here for is the European Union's Next Generation Internet Initiative,
which is very grand. In fact, what it says is, re-imagining and re-engineering the internet for the third millennium and beyond, and they can't really help with that. That's far too grand. But what's a lot more interesting, and perhaps a lot more practical, is that the EU's NGI
initiative has gone and given funding to a variety of bodies around Europe, one of them being NLNet, based in the Netherlands, and they put it this way, that the internet is broken and we need to fix it. In fact, they say, tell us how to fix the internet.
So, with some friends, we did. Part of it, I think an important part of it, is about databases. We decided to call it LumoSQL, and so, being in Belgium, of course, we have Meelo, one of the very few images on the internet that you are actually allowed
to reuse, mostly because it's in Belgium, but also because it hasn't changed in years, which would be the nature of much of our broken internet today, including databases, rows and columns, and RDBMSs being behind most things that are happening on the internet
now, even now, with fancy new column stores. What do we have in our famously broken internet? Please interrupt me if you think it's not broken. The main aspects are applications, which are very centralized,
which don't cope very well with scaling, which are very expensive to scale. We have the networking, as well as the apps, which is centralized and therefore broken,
because as soon as we have centralization, then we have security issues. I'm assuming most people in the room would be familiar with that. And then we have devices, which are a very large part of the internet now, the things in the internet of things, and they clearly work because everybody loves them, and they clearly don't work because they're insecure and they
break and they're unreliable, and there are various other things. The internet needs fixed. Thank you, NLNet. They decided to support LumoSQL. Is there something about databases, as used on the internet, that is particularly broken?
So what do we have in the most popular databases? RDBMSs. We have the famous names, the Postgres, the MariaDB, closed source ones, Oracle, and so on. But probably the most deployed would be SQLite. And I managed not to have my phone on me,
but I could, right here, be waving my phone, which has Android on it and therefore has SQLite on it, several copies of it, almost certainly. iOS, the same thing. Anyone who uses Mozilla has got at least one copy of SQLite. It's an embedded database that gives you an SQL-type
interface. Embedded meaning it doesn't do networking. So these, I claim, are the things we're looking for in next generation internet databases.
And SQLite, people love it, good reason to love it. It's been around for a long time, 91 if I remember rightly, and it does what it says on the tin. It's small compared to many other databases. It's moderately reliable except that any database developer who uses SQLite that
I've ever met has many stories about corruption. I've done various informal tests. Lots and lots of people report SQLite corruptions, but the good outweighs the bad.
And privacy, of course, SQLite doesn't do encryption. More on that later. And so it's not really a good fit for where we are at in the 21st century with very high performance devices that have a lot of concurrency on them, where corruption is increasingly a problem. Some electric cars are basically Android on wheels, and where privacy is
increasingly mandated; I stand right here in the European Union, unlike when I go back home. And so it was quite interesting when, in 2013, a well-known developer called Howard Chu made
a posting, it was noticed around the world, where he said he had taken SQLite and put a new key value store underneath. So the key value store is the very simple database compared to
SQL, where you have a list of items, one, two, three, four, five, six, seven, and then values against those. So number one might be table or chair, and that's all it does. And a key value store is at the bottom of just about every database. What he said was, I'm the author
of LMDB, one of the faster and more reliable key value stores around the place, and I have replaced SQLite's key value store. And so he posted some figures that looked very impressive. He's a very experienced database developer, and so lots of people said, oh wow.
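For readers who haven't used a key-value store directly, here is a minimal LMDB sketch in C (not LumoSQL code, purely an illustration of the level being described): one write transaction storing key 1 -> "chair", one read transaction fetching it back. It assumes a directory ./db already exists and squeezes error handling into a single macro.

```c
/* Minimal LMDB key-value example: store one key, read it back.
 * Build (roughly): cc kv.c -llmdb
 */
#include <lmdb.h>
#include <stdio.h>
#include <string.h>

#define CHECK(rc) do { if ((rc) != MDB_SUCCESS) { \
    fprintf(stderr, "lmdb error: %s\n", mdb_strerror(rc)); return 1; } } while (0)

int main(void)
{
    MDB_env *env;
    MDB_txn *txn;
    MDB_dbi dbi;
    MDB_val key, val, out;

    CHECK(mdb_env_create(&env));
    CHECK(mdb_env_open(env, "./db", 0, 0664));   /* "./db" must already exist */

    /* One write transaction: key "1" -> value "chair" */
    CHECK(mdb_txn_begin(env, NULL, 0, &txn));
    CHECK(mdb_dbi_open(txn, NULL, 0, &dbi));
    key.mv_size = strlen("1");     key.mv_data = "1";
    val.mv_size = strlen("chair"); val.mv_data = "chair";
    CHECK(mdb_put(txn, dbi, &key, &val, 0));
    CHECK(mdb_txn_commit(txn));

    /* One read-only transaction: fetch the value back */
    CHECK(mdb_txn_begin(env, NULL, MDB_RDONLY, &txn));
    CHECK(mdb_get(txn, dbi, &key, &out));
    printf("key 1 -> %.*s\n", (int)out.mv_size, (char *)out.mv_data);
    mdb_txn_abort(txn);

    mdb_env_close(env);
    return 0;
}
```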
And for a variety of reasons, it stayed at 'oh wow, and where is that thing?' until about a month and a half ago. And so at that point, the LumoSQL project started and said,
let's do some code archaeology. And so we had some components to that. Mr. D. Richard Hipp, well known again since 1991, he released under the GPL the first versions of SQLite. It's no longer under the GPL, but that is its origins. And these are database
developers with decades and decades of experience. I don't know very much about databases at all, but still, someone had to do this. Howard, very well known again, developed LMDB, and this is a key value store that is remarkably small and still behaves quite like a database in
terms of its guarantees of consistency and concurrency and multiple threads accessing at once. It's got quite a small footprint, certainly smaller than the key value store under native
SQLite, even today. So he mashed these together into a thing called SQLightning. It was just a prototype. He knew it was a prototype. In fact, even the name was already taken, but that's great. He thought a thing needed done. He had a go, and all the world benefited
just a bit later, like two months ago. And so Keith and myself and one or two others who may wish to identify themselves later on in this talk got together and created LumoSQL. It's not a good idea to fix something that isn't broken. And so in general,
we can imagine that databases are broken on the internet because the internet's broken, but there are specific things that SQLite doesn't address and that really matter to a lot of people. Given that there is perhaps, depending how one counts, two or three billion people using
SQLite right now today around the world, that is a matter of some import if it's not really delivering on what these people need. And so you can go to SQLite.org and you can see
what their supported use cases are, and they make some exceptions. They say very clearly, we don't do high concurrency because you might get corruptions. And it says very clearly that there are various other use cases that will result in corrupted databases,
most of which are quite common these days. A mobile phone is a very powerful computer compared to not very long ago, and yet SQLite with its crashy corrupt ways,
it doesn't quite say what I meant it to say, but you understand, is deployed at scale. And so these are the things that are broken. Encryption is an issue because we're standing here in the European Union, which mandates that personal data shall be encrypted at some point soon. We expect a new regulation to arrive called e-privacy, and that will require end-to-end
encryption, including on the terminal device or mobile phone. SQLite doesn't support encryption in its open source form. You can go and pay the people at SQLite.org and they will give you one, and there are
other ways around that, but still, that's not ideal. SQLite is famous for having a really quite comprehensive test suite, and that is true. It tests the code. It doesn't necessarily
test use cases that are relevant to the users, and there's quite a difference there. And so it'll do things like 25,000 inserts and see how long that takes, and that's great,
but it won't necessarily do 25,000 inserts from three different writers at once, and so on. And so there's a lot of work to be done to take the SQLite that we have today, used at scale and loved, and make it more relevant to the 21st century.
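To make that multi-writer point concrete, here is a hedged sketch, not taken from the LumoSQL benchmarking tool, of the kind of exercise plain timing tests tend to skip: several threads inserting into the same SQLite file at once and counting how often they are refused with SQLITE_BUSY. The file name test.db and the table are made up for the example.

```c
/* Sketch of a multi-writer stress test against one SQLite file.
 * Build (roughly): cc writers.c -lsqlite3 -lpthread
 */
#include <pthread.h>
#include <sqlite3.h>
#include <stdio.h>

#define WRITERS 3
#define ROWS    1000

static void *writer(void *arg)
{
    long id = (long)arg, busy = 0;
    sqlite3 *db;
    if (sqlite3_open("test.db", &db) != SQLITE_OK) return NULL;
    sqlite3_busy_timeout(db, 100);          /* retry for up to 100 ms on contention */
    for (int i = 0; i < ROWS; i++) {
        char sql[128];
        snprintf(sql, sizeof sql,
                 "INSERT INTO t(writer, n) VALUES(%ld, %d);", id, i);
        if (sqlite3_exec(db, sql, NULL, NULL, NULL) == SQLITE_BUSY)
            busy++;
    }
    printf("writer %ld: %ld inserts refused with SQLITE_BUSY\n", id, busy);
    sqlite3_close(db);
    return NULL;
}

int main(void)
{
    sqlite3 *db;
    pthread_t th[WRITERS];

    /* Create the table once, then let the writers contend for the file */
    sqlite3_open("test.db", &db);
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(writer INT, n INT);",
                 NULL, NULL, NULL);
    sqlite3_close(db);

    for (long i = 0; i < WRITERS; i++)
        pthread_create(&th[i], NULL, writer, (void *)i);
    for (int i = 0; i < WRITERS; i++)
        pthread_join(th[i], NULL);
    return 0;
}
```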
So we've had some fun and we've done some things, and there is an absolutely really, really cool announcement type thing to make. We did some code archaeology. It wasn't easy to find what the antecedents of LumoSQL are. That's the boring bit, but it did take a lot of time
and effort. We've written a benchmarking tool so that we have some idea of whether we're actually improving things or not, or how bad they were in the first place. Benchmarking is not as easy as it may seem, as I have been learning working with Keith, who just loves it.
And we fixed some bugs, and we'll talk about that again in a minute, but there are a couple of blocking bugs that made it impossible to see actually how good the idea of SQLightning was in the first place, and we've got some features. This is SQLite. By and large,
if it's on this side, then people who are SQLite users here will be used to the sqlite3_prepare API call; those are the bits that are serviced by prepare.
And then this side is the sqlite3_exec or sqlite3_step API calls. And so btree.c implements the B-tree, under which we have a pager,
under which we have some operating system specifics, but the pager is the thing the B-tree tells 'I need to store a page', and the pager decides where it should go and whether
it should be journaled. That is the way that SQLite works at the moment. Very crucially, the pager handles write-ahead logs. And so again, for those who are familiar with programming in SQLite, those are the two bits we care about.
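For anyone less familiar with that split, a minimal sketch of the public API being described, which LumoSQL keeps unchanged: sqlite3_prepare_v2() covers the parse-and-plan half, and sqlite3_step() drives execution down through the btree, pager and storage backend. The query, file name and table here are only illustrative (they reuse the hypothetical test.db from the earlier sketch).

```c
/* The two halves of the SQLite API the talk refers to:
 * prepare (parse/plan) and step (execute, which is where the
 * btree/pager/backend work happens).
 * Build (roughly): cc query.c -lsqlite3
 */
#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *stmt;

    if (sqlite3_open("test.db", &db) != SQLITE_OK) return 1;

    /* Front half: parse and plan the SQL */
    if (sqlite3_prepare_v2(db, "SELECT writer, COUNT(*) FROM t GROUP BY writer;",
                           -1, &stmt, NULL) != SQLITE_OK) {
        fprintf(stderr, "prepare failed: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return 1;
    }

    /* Back half: each step executes against the storage backend */
    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("writer %d wrote %d rows\n",
               sqlite3_column_int(stmt, 0), sqlite3_column_int(stmt, 1));

    sqlite3_finalize(stmt);
    sqlite3_close(db);
    return 0;
}
```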
The sqlite3_step side is where you're going to see the differences in what we've done with LumoSQL so far. I'm going to come back to some of the details of what's been done. I just want to cover a very important thing: write-ahead logs. So the pager gets to
do the concurrency bit and the security bit. Sorry, I should say the integrity bit for SQLite. So the idea is that there is a transaction, the transaction doesn't complete straight away because
there's many things going on, and so we write out the interim state to a little file called a write-ahead log file. If you're familiar with Postgres, you can see a whole lot of files called WAL files. And in fact, this is a tried and true 1990s database technique,
and every single major database used on the internet today uses write-ahead logs. Now there are two features of modern operating systems that mean that's probably not necessary anymore and actually not as good as we can do. And those features are journaling file systems,
which, if you look at it, is like a special case of write-ahead logs. So if you're running this on, say, ext4, then SQLite does its write-ahead logging, or Postgres or MariaDB does all its write-ahead logging, and then underneath, the file system just does the same thing all over again. And that doesn't sound very efficient, does it? And then the other thing is that we
have a virtual memory system that's really very good. A lot of time has been put into modern operating systems' virtual memory systems, especially Linux, but there are others we do acknowledge, the lesser people in the universe. And this is where we get the idea of memory
mapping an entire database, a single-level store, where basically we allow the operating system to worry about all the
details of whether a page is on disk or in memory and to keep it as safe as we possibly can. There is a system call, msync, thank you, and msync will,
as often as you wish for safety, keep the memory-mapped image up to date on disk. So with these two advances in operating systems, which really are the core of what makes a robust operating system, we don't really need write-ahead logs anymore, but that's a lot of
technical debt. Postgres has two and a half million lines of code I think, MariaDB I think has more. SQLite, bless it, only has 350,000 lines of code and it's still doing write-ahead logs. Wouldn't it be great if we could eliminate write-ahead logs? Which we have, and the way
we've done that is by making sure that LMDB works underneath SQLite correctly, which it now does as of a very recent date. LMDB doesn't use write-ahead logs; that's one of its very strong features. The coolest thing is it memory-maps everything;
it's as safe as it can possibly be using just one file; it doesn't have to keep consistency with some journal that you have to replay after a crash. This is amazing, and so once we fixed a few bugs in LumoSQL, all of a sudden we got this benefit for free, because LMDB is
sitting at the B-tree level, completely replacing the pager and some of the operating system specific things, and we now have the world's first database API, I have to put it that way, used at scale which does not have a write-ahead log. That's absolutely amazing.
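As a rough sketch of the single-level-store idea being described, using the POSIX primitives rather than LMDB's actual code: map a file into memory, modify it in place, and call msync() whenever durability is needed. The file name is made up for the example.

```c
/* Rough illustration of the single-level-store primitives:
 * map a file into memory, modify it in place, msync() when
 * durability is needed. This is the idea, not LMDB's code.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t size = 4096;
    int fd = open("data.bin", O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ftruncate(fd, size) != 0) return 1;

    /* The file now *is* the data structure: no separate log to replay */
    char *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;

    strcpy(p, "hello, single-level store");

    /* Push the dirty pages to disk as often as safety requires */
    if (msync(p, size, MS_SYNC) != 0) perror("msync");

    munmap(p, size);
    close(fd);
    return 0;
}
```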
I like it, I'm quite enthusiastic about it. So we have some other things with LumoSQL: where we are, what we've done, what we'd like to do. It's a really baby project, that's the first thing, I mean we're talking like two months old. NLNet has made this possible,
we're starting to get a good idea of where it's going next, we need to talk to NLNet about just where it goes and how fast, and there are some very, very important and I think cool new features to come. But just before we
go on to that, are there any comments from some of the expert database users and implementers in the audience, of which I know there are at least four? Okay. I'm very, very conscious that the amount of database expertise, and all the millions of lines of code that I've referred to,
is huge. There are people who have spent their entire professional lives making the reliable row and column stores that the internet runs on today, and, what do you say, on the shoulders of giants we stand. So one of the things,
the very important things, that we are really required to have in a database these days is reliability, and detecting whether what we wrote is still consistent. So the ACID things, right: atomicity, consistency, isolation, durability; the two we care about here are integrity and consistency.
What we have with all of the major databases, including SQLite, is that external processes have been designed to go around and check whether what's in the database matches what's on the disk.
If you look at a running Oracle database, you can see there are these processes that are frantically going through the database trying to do its integrity checking, completely separate from whatever the applications might be trying to do.
That's quite an interesting thought: you've got concurrent access by an integrity checker looking over the shoulder of a thing that's supposed to be writing reliably. So basically what that says is that we don't have any good way now of getting a mainstream column and row database and knowing, with any certainty, that what I just read from disk now
is what was written last week and that doesn't seem like a big ask for the 21st century, but it does seem to be a problem and so we have an idea and we know how we're going to implement this idea, we hope it's going to work, it sounds easy enough, although we've already discovered
some hairy use cases, corner cases. And the idea is this: what if each row had a checksum? How about that? So every time we update a row we keep a little checksum on the far end, and then when we read it in off disk we have a pretty good idea whether what we read just
five minutes ago was what was written last week. I haven't found any major database that does this, and I don't know why, but if I'm required to have consistency it seems like a good way of doing it. Anyway, that's what we're implementing, and I'll let you know if it works.
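A hedged sketch of that per-row checksum idea; the eventual LumoSQL design may well differ. Here a 32-bit FNV-1a hash stands in for whatever checksum is finally chosen, attached when the row is written and verified when it is read back.

```c
/* Illustration of the per-row checksum idea: compute a checksum
 * when the row is written, verify it when the row is read back.
 * FNV-1a is used here only as a placeholder checksum.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t fnv1a(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t h = 2166136261u;
    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

struct row {
    char     payload[64];   /* the row data as stored on disk */
    uint32_t checksum;      /* stored alongside the row */
};

static void row_write(struct row *r, const char *data)
{
    memset(r->payload, 0, sizeof r->payload);
    snprintf(r->payload, sizeof r->payload, "%s", data);
    r->checksum = fnv1a(r->payload, sizeof r->payload);
}

static int row_read_ok(const struct row *r)
{
    /* On read, a mismatch means the row changed since it was written */
    return fnv1a(r->payload, sizeof r->payload) == r->checksum;
}

int main(void)
{
    struct row r;
    row_write(&r, "alice,42");
    printf("intact: %s\n", row_read_ok(&r) ? "yes" : "no");

    r.payload[0] ^= 0x01;   /* simulate silent corruption on disk */
    printf("after corruption: %s\n", row_read_ok(&r) ? "yes" : "no");
    return 0;
}
```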
So that's quite exciting. So what we have to do is make a first release; we have to have a way of having multiple backends, and I'll be going back and talking about those multiple backends in a minute;
and we are going to implement no write-ahead logs, and be sure that that is right, because you don't do this lightly; and we're going to implement per-row checksumming. That's what's in the very short-term roadmap. The backends: a back-end muxer
means we've got multiple backends and we can use one or all of them at once. At the moment we have two backends and we can't really switch between them: we have the classic SQLite backend with the B-tree code, which is well used, and at least when it doesn't work, we know how it doesn't work; and we have the LMDB backend, which has really only
been passing all of the SQLite tests for a very short time now, so I'm not going to say that's production ready. But already we have pretty clear designs for other backends.
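Purely hypothetically, and not taken from the LumoSQL source, a backend muxer could boil down to a small table of function pointers, so that the layers above never care which key-value store sits underneath. A toy in-memory backend stands in here for btree.c or LMDB.

```c
/* Hypothetical sketch of a pluggable key-value backend interface;
 * the real LumoSQL backend API may look quite different. */
#include <stdio.h>
#include <string.h>

struct kv_backend {
    const char *name;
    int (*put)(const char *key, const char *val);
    int (*get)(const char *key, char *val, size_t vlen);
};

/* A toy in-memory "backend" standing in for btree.c or LMDB */
static char mem_key[64], mem_val[64];
static int mem_put(const char *key, const char *val)
{
    snprintf(mem_key, sizeof mem_key, "%s", key);
    snprintf(mem_val, sizeof mem_val, "%s", val);
    return 0;
}
static int mem_get(const char *key, char *val, size_t vlen)
{
    if (strcmp(key, mem_key) != 0) return -1;
    snprintf(val, vlen, "%s", mem_val);
    return 0;
}

static const struct kv_backend toy_backend = { "toy", mem_put, mem_get };

/* The muxer: upper layers call through the table and never know
 * which store is underneath. */
int main(void)
{
    const struct kv_backend *be = &toy_backend;   /* chosen at open time */
    char out[64];
    be->put("1", "chair");
    if (be->get("1", out, sizeof out) == 0)
        printf("[%s backend] 1 -> %s\n", be->name, out);
    return 0;
}
```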
You can think of other key value stores; there are reasons why you might want other ones in there. Going higher up the architecture diagram, you can imagine that it's going to be an interesting place to put in some networking facilities. That isn't within the next month or
two, the other things I've discussed are within the next month or two, but these are the things we're trying to do. One of the big, big elephants, non-Postgres elephants, in the room is how do you track upstream SQLite, and we've got some ideas about that, we haven't got a definitive
answer about that, but one thing that I want to make absolutely clear is that right now we are not reinventing SQLite. There is a lot of code there; if you look at the left hand side of that architecture diagram that I showed you earlier on, there is a good deal in getting a statement,
parsing a statement, preparing it for the virtual machine, and so on, and we don't in any way want to re-implement any of that, and we want to have all their bug fixes. We believe we can solve this problem by judicious use of good APIs, I'm pretty confident about that. But in a nutshell that's LumoSQL, that's where we're up to, that's where we're going, and next stop 3.5
billion people's pockets, and there we are, do we have any comments at this point, I mean there's a lot more, I've been skipping over the highlights, we have a question, I think that was absolutely
a planted question, what makes LMDB much better than any other key value stores,
okay I'll take that one, so firstly as we've already talked about, it's a single level store, so it only has one file, it maps everything in memory, and if everything goes well, as we would hope, then there's a much lower risk of corruption and loss of integrity, it is also
actually very efficient. There are quite a few others: there's LevelDB, and there are other design architectures out there that aren't based on B-trees at all, there are the LSM, log-structured merge, systems, and there are quite a lot of other key value store libraries.
LMDB is extremely widely used, and that is because it's almost a drop-in replacement for Berkeley DB, and Berkeley DB can't really be used anymore because Oracle changed the license on it; the keyword is Sleepycat if you care about the history there.
So LMDB is extremely well tested. It's at the bottom of OpenLDAP, which is the project out of which it came, but now you'll find it under all kinds of other projects; well-known ones include Samba and bits of Mozilla and so on and so forth.
LMDB therefore is very performant, it is known to be quite reliable, you can get corrupted databases out of LMDB, but I am going to go out on a limb and say not nearly as often as SQLite,
and it has a very small footprint, much smaller than many other B-tree code bases, including SQLite's. The design goal for LMDB was to fit into L1 cache on a typical reasonable CPU, and it manages that. That's really cool, because if you can keep the cache hot, then your performance overall increases, and that really matters on
funky modern architectures. That's a start; will that do? So, thank you, Luke.
I did the Wikipedia page; the corruption occurs if you don't switch on fsync mode. Corruptions and all kinds of dreadful things happen with LMDB if you're unaware of its sharp edges. So, there was another question there. Are you aware of ActorDB?
No, I'm not. Okay. ActorDB actually did this combination of SQLite with LMDB underneath it several years back, and they managed that; they called that a database node,
and then on top of that they have an Erlang layer, and that manages a number of these kinds of nodes. So does this still exist, is it an active project? Yes, it's an active project on GitHub. Marvelous, I can't wait to meet the team. I would have a look.
Okay, excellent. What we've got is the beginnings of a soon-to-be much better benchmarking tool. It's already been in operation for a few years. Marvelous. Never heard of it. Thank you very much, I'm going to follow that up with enthusiasm.
No more questions? Do you happen to know the performance limitations of LMDB? I don't, and I'm skeptical that anybody actually does. I don't want to take that
any further, because I want to go on to some other questions. First of all, without naming any names, is there anyone in the audience who would like to talk about what we know so far about LumoSQL internals? Because we fixed some bugs, and it's quite interesting as to the
nature of some of those bugs, and what it might say about where we go to look further for performance enhancements. We can perhaps do that as a prepared talk another time, then. We certainly have a skilled bug fixer in the audience tonight. What I'd like to do is just quickly run over
some of the things that we've discovered. Howard's original port was a prototype only. It would be very interesting to compare this to how ActorDB did it.
Actor. Oh, as in the name, Actor. Okay, an Erlang community thing. Okay, so I'll be very interested to see how the different approaches compare. Just replacing btree.c is a fairly
limited and short-term approach. That's our conclusion so far. It's great. We need to do this. We need to have multiple key value stores down there, but if you've got a general purpose and highly effective and widely used SQL interface at the top end, it seems a bit limiting
just to have a key value store as the ultimate destination in the back end. That is something that we're very keen to introduce an API for so that we can not only switch between key value stores on disk but also between different network models and other ways of storing and retrieving
data. Where we fit this in, at the API level in the architecture diagrams, is under active consideration right now. Is there anything further? Do we have SQLite users here?
Is it doing exactly what you want? Not really. Why not? What's the problem? SQLite is too slow. Here's an interesting thing. A common use case for SQLite,
a really common use case, is in the build process for other projects that normally use a real network database; they put SQLite in there because it's quicker and lighter to start up and faster.
I also think that SQLite is quite slow. This is the beginnings of what we can see with the benchmarking we're doing. It has particular hotspots, but it's really interesting that if you're finding SQLite slow, that probably means, depending on your use case, you'd find a real database unbearably slow. This isn't what we want in this century. I feel your pain and
that is something that I'm trying to look at quite hard: how can we make this thing faster from top to bottom? Even though there are only 350,000 lines of code as compared to millions, there is still quite a bit of technical debt in SQLite, so much so that they tried, a few years ago, to
make an SQLite 4, which didn't work, but they identified quite a lot of historical cruft that they would like to do without in a new version of SQLite, and that remains true today. Some of
that is why it's slow. I look forward to finding out whether we can actually make this thing faster and I don't know the answer. I think it's very likely we can. I certainly think that there is a lot of benefit to be had from talking to the really good experts who've been working on this code base for years and years and years
and they know that there are some bits that haven't changed, and why, and maybe how they could change. Any other good or bad things about SQLite, from you current users? Speed, yes. Corruption, we've discussed. Is there anything else that could be maybe better?
The API. That's the one thing that mostly I don't want to see change. Now, noting that I did say encryption, there is encryption already in the API,
so we wouldn't be introducing anything new there. The point is, it's at a scale of billions in its usage worldwide, and that is the API that people are using. I'd be very reluctant to change that even a bit. The only thing I have thought of is that there are older API interfaces associated
with quite a lot of code that we could perhaps, after great care and consultation, drop from the code base; but the sqlite3_ APIs, I would feel very funny changing those in an incompatible fashion. Did you have something in particular you hate?
Okay, but SQLite has all kinds of APIs, all kinds of wrappers, all kinds of languages.
The main point of LumoSQL is to be able to seamlessly go wherever SQLite is today and hopefully with better effect, or if not, at least we'll produce benchmarking and stability results that will be of use to the entire SQLite-using community. Anything else?
Did you add any unit tests or benchmarks for concurrent reads and writes yet? At the moment, we're really pleased that we've got the existing ones working, and what we're doing is creating a benchmarking tool that compares consecutive runs with each other, which is not something that SQLite does.
Now that's the thing. There are three test environments for SQLite. The oldest one is the TCL test suite, which is very extensive, lots of code there, and that does functionality testing and
it really has got a lot of coverage. Then there is the SQLite 3 correctness suite, something like that; the name of it escapes me at the moment, but it is an SQL correctness testing engine that can run against pretty much any database. The only thing it cares about
is if you put in a certain amount of data, a certain kind of data, do you get the right answer back? But the third one is what we are told is a very excellent, fast, and even much more comprehensive test system for SQLite, which you can get access to if you go to
SQLite.org people and pay them money. A lot of people do seem to think that that might be a bug. How it's addressed, I don't know, but that is certainly a question that keeps popping up. So there we are. That is the introduction to LumoSQL, which is a brand new
project and we keep finding exciting new things every day. We keep finding that benchmarking would appear to be the answer at the moment and we keep finding people who say, oh, I can imagine
contributing, and we're trying to make sure that they do. There we are. I don't have anything more to say except that we are in Belgium, right, and I did say that Meelo hasn't changed for years and years and years. I spent so much time with the Marsupilami, I just have to finish with them, because we really do need new internet paradigms. That will do for me.
Belgium, what can you say? So thank you, ladies and gentlemen. That's LumoSQL.