Help! I Need To UnSQLize My Application
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 56 | |
Author | ||
Contributors | ||
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/67197 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Berlin Buzzwords 20227 / 56
22
26
38
46
56
00:00
Musical ensembleWordBitPoint (geometry)Relational databasePairwise comparisonCovering spaceDatabaseCartesian coordinate systemQuery languageEndliche ModelltheorieMappingWeb pagePresentation of a groupXMLUMLLecture/ConferenceComputer animation
01:49
Software developerTouch typingUniform resource locatorSoftware developerTwitterLink (knot theory)Shared memoryInformationComputer animation
02:39
Data modelDatabaseAdditionDifferent (Kate Ryan album)Right angleEndliche ModelltheorieLimit (category theory)EvoluteReading (process)ScalabilityMeeting/InterviewComputer animation
03:20
Tap (transformer)Boiling pointData modelEmailEndliche ModelltheorieDatabaseField (computer science)Category of beingPointer (computer programming)Table (information)Operator (mathematics)Different (Kate Ryan album)Arithmetic meanSinc functionWebsiteWeb pageQuicksortTerm (mathematics)Natural numberObject (grammar)BitFile formatInformationRelational databaseDemoscenePublic-key cryptographyMultiplication2 (number)Type theoryMoment (mathematics)String (computer science)Boolean algebraData storage deviceDefault (computer science)Numeral (linguistics)Series (mathematics)Integrated development environmentNumberRight angleData typeMereologyCASE <Informatik>SynchronizationLevel (video gaming)CausalityComplex (psychology)Product (business)Software developerComputer animation
09:11
DatabaseNP-hardData storage deviceNatural numberSource codeComputer animation
09:43
DatabasePower (physics)Endliche ModelltheorieHydraulic jumpMappingEquivalence relationData storage deviceInformationRow (database)Different (Kate Ryan album)View (database)Query languageRelational databaseObject (grammar)Database transactionMixed realityTable (information)Direction (geometry)Subject indexingOperator (mathematics)Order (biology)Block (periodic table)Data modelKey (cryptography)MereologySeries (mathematics)Server (computing)Integrated development environmentField (computer science)Multiplication signRight angleOracleMultiplicationComputer animationDiagram
15:08
TwitterSoftware developerKeyboard shortcutBitSingle-precision floating-point formatOrder (biology)Validity (statistics)Type theoryMultiplication signServer (computing)AdditionDatabase normalizationLimit (category theory)Point (geometry)Ideal (ethics)CASE <Informatik>BefehlsprozessorDatabaseComputer configurationMathematicsSource codeField (computer science)Different (Kate Ryan album)Data structureHard disk driveFile formatSubject indexingEndliche ModelltheorieOverhead (computing)Cartesian coordinate systemScalabilityTheoryAuthorizationVirtual machineScaling (geometry)Medical imagingRelational databaseHigh availabilityComputer animation
20:17
Object (grammar)Software developerInformationMeeting/Interview
20:53
Game theoryReal numberDatabaseCodeBitString (computer science)Connected spaceClient (computing)Right angleComputer animationSource code
21:29
MassMultitier architectureDatabaseFreewarePoint cloudProjective planeSpacetimeComputing platformString (computer science)Connected spaceResultantLine (geometry)Single-precision floating-point formatRelational databaseCodeSelectivity (electronic)Type theoryDifferent (Kate Ryan album)Field (computer science)Loop (music)Object (grammar)BitOperator (mathematics)Game theoryLatent heatTable (information)Matching (graph theory)Computer programmingAdditionSystem callCASE <Informatik>Front and back endsFunctional (mathematics)Filter <Stochastik>Right angle
25:26
Gamma functionCartesian coordinate systemCodeBitFocus (optics)Type theoryInsertion lossInformationMeeting/Interview
26:02
Price indexCommodore VIC-20Operator (mathematics)Series (mathematics)Level (video gaming)Matching (graph theory)Cartesian coordinate systemField (computer science)CodeCountingSet (mathematics)Right angleSource codeComputer animation
27:00
Network topologyApproximationOutlierSubsetPolymorphism (materials science)Cartesian coordinate systemMultiplication signAuthorizationCategory of beingHuman migrationTable (information)Pattern languageProfil (magazine)DatabaseLink (knot theory)Shared memoryType theoryLevel (video gaming)EmailComputer animation
28:47
Order (biology)Relational databaseInformationData structureScripting languageQuicksortDatabaseEquivalence relationAdditionBitMixed realityMereologyLattice (order)Right angleResultantEndliche ModelltheorieKey (cryptography)Multiplication signTable (information)CodeField (computer science)Computer animation
31:04
Service (economics)Table (information)Link (knot theory)View (database)Selectivity (electronic)Equivalence relationDatabaseObject (grammar)Query languageMatching (graph theory)Order (biology)Type theoryField (computer science)Computer animation
32:02
Subject indexingLink (knot theory)AuthorizationView (database)Latent heatRelational databaseInformationEndliche ModelltheorieQuery languageFood energyMultiplication signTable (information)Point (geometry)ResultantComputer animation
33:10
Field (computer science)View (database)Object (grammar)AdditionField (computer science)BitInformationDescriptive statisticsTable (information)GodSource code
33:48
Right angleDatabaseStability theoryObject (grammar)MereologyField (computer science)SoftwareTable (information)Multiplication signQuery languageData storage deviceSoftware developerDrop (liquid)Computer animation
35:00
Object (grammar)Pattern languageNumberBoiling pointObject (grammar)MereologyAuthorizationInformationLogicMappingDescriptive statisticsLatent heatLink (knot theory)Web pageRight angleComputer animation
35:35
Cartesian coordinate systemObject (grammar)Extension (kinesiology)Web pageMathematicsAuthorizationLink (knot theory)Object (grammar)MereologyPattern languageInformationNumberProfil (magazine)Complex (psychology)Extension (kinesiology)Field (computer science)DatabaseCASE <Informatik>Reading (process)Lattice (order)
36:47
Service (economics)Numbering schemeObject (grammar)Subject indexingGame theoryProfil (magazine)AuthorizationQuery languageWeb pageField (computer science)DatabaseNumberPattern languageReading (process)NeuroinformatikLatent heatOperator (mathematics)CountingSystem callMereologyQuicksortMultiplication signMechanism designComputer programmingRight angleTouchscreenInsertion lossComputer animation
38:16
Field (computer science)Web pageLink (knot theory)Ocean currentRelational databaseEndliche ModelltheorieDatabaseHuman migrationInheritance (object-oriented programming)
39:07
Mass flow rateGene clusterTouchscreenRelational databaseSlide ruleInformationPresentation of a groupLink (knot theory)BlogSynchronizationHuman migrationPattern languageData storage deviceDatabaseUniform resource locatorMeeting/InterviewLecture/Conference
40:38
Absolute valueScalabilityDatabaseEndliche ModelltheorieRelational databaseLecture/Conference
41:17
DatabaseDifferent (Kate Ryan album)Endliche ModelltheorieMeeting/InterviewLecture/Conference
41:56
Absolute valueMultiplication signHTTP cookieAreaFreewareMusical ensembleLecture/ConferenceMeeting/InterviewJSONXMLUML
Transcript: English(auto-generated)
00:07
Thank you so much. Thank you for the introduction. I like to invent new words for my titles. So I need to un-SQLize my application. So first of all, thank you very much, everybody, for being here.
00:22
I know that, unfortunately, I'm not there in person. I was really looking forward to get a query versed in Berlin. But unfortunately, that didn't work out. But I'm glad to be here still in 2D. That'll be good enough for now. So we'll talk about un-SQLization of applications
00:40
in this presentation, so basically how to migrate from a relational database into a document database such as MongoDB, as you might have guessed. So what are we going to talk about today? Let me see if my clicker actually works. Takes a second here.
01:01
I think I got it. There you go. So what we're going to cover today, just to make sure everybody's on the same page, I'll talk a little bit about what is SQL. I'll assume that you all know what is SQL at this point. And I'll talk a little bit about NoSQL. More specifically, I'll be talking about the document model databases.
01:20
And then I'll talk a little bit about some mapping, so just to make sure we all have the same terminology. So I'll cover just comparisons between SQL or undocumented databases. Next, I'll talk a little bit about some benefits and finally, we'll look into unSQLizing an application. So I'll go into the details. I'll take an application that I have, or at least a schema from a relational database,
01:43
and then show you how I would migrate that into a MongoDB database. So before we just get started, so hi, my name is Joel. I work as a developer advocate. Full disclosure, I do work at MongoDB. I am based in lovely Ottawa, Canada,
02:01
so that's why it wasn't very easy for travel. And if you ever want to get in touch with me, Twitter is always, always the best way. So you've got it right there. Nope, on the other side. So you've got my Twitter URL there, or Twitter username. That's always the best way to get in touch with me.
02:20
There's also one link that I'll share at the end with all of the different links that I'm talking about, as well as all of my contact info. So feel free to reach out to me if you've got any questions, comments. I'll be more than happy to answer those. All right, so SQL versus NoSQL. So first, NoSQL is kind of a very broad thing, right?
02:43
It could mean a lot of different things. It's like a NoSQL. What does that even mean? First of all, interestingly enough, this is usually it means not only SQL, so it's like an addition to SQL databases. But really, it's like an evolution of your traditional SQL databases.
03:02
There is a couple of people a couple of years ago that were trying to use SQL databases, and they came to some limits. There's a couple of limitations that you will run into, namely with scalability and the speed of read and writes that you get with a SQL database. So they came up with this idea of the document model.
03:22
So just to make sure that everybody's on the same page again, let's take a quick look at what a document model would look like as opposed to a traditional SQL relational database model. I like beer. I'm guessing most of you also are.
03:40
I mean, this is Berlin, after all. But seriously, I like to brew my own. So I have this website where I want to list all the beers that I currently have on tap, and I also list the recipes and all that stuff. So, and I've been working with SQL database since the last millennium.
04:00
So it's kind of natural for me to think in terms of relational databases. So if I were to create this database to have my beers listed, I would have a new table called beers, and then I would have an ID, the name, as well as a Boolean for, is it on tap or is it available at the moment or not?
04:22
So this is my first beer. You see, I've got my porter there, and it is currently available on tap. So, you know, you should all be somewhat familiar with that. Now, if we wanted to add something like, if we wanted to create that into a document database, well, we would have a document.
04:41
Document is very similar to a JSON object, basically. So you have those curly braces, and then you've got those key pair values. I've got my ID field, underscore ID is the default ID for a MongoDB database. Then I've got the name as well as the isTap property set to true. So you can see that I've got strings in there, I've got numbers, I've got Booleans.
05:02
So you can kind of still use those different types as well. Now, if I wanted to add extra things, so let's say I wanted to add the recipe for this specific beer, I will need to add a couple of series of instructions. And in my traditional SQL database, I could add different columns for each one of the steps.
05:22
But obviously that's not ideal, right? Cause I don't, it's not a fixed number of steps. So the way typically you would do that is to create a new table that would have a relation with the actual beer and then, so beer ID equals one, and then each one of those instructions, they're all listed there. So now you've got two tables and you've got that information separated
05:41
those two different tables. If I am looking at my document model again, the difference here is that I can actually embed all of those steps into an array, which is still part of my same object. So you can see already that all the information is stored together, and it's all available into that single document. That makes it much better,
06:01
much quicker to access the data and to be able to have access of all those properties without the need of getting information from different tables. And you can go and make it a lot more complex. So you can see here that I've added a bunch of different beer names in there.
06:23
And for my Mango IPA, for example, you can see that it has all of those properties. So it has the recipe, the instructions that we've seen, has the grains, which is an array of objects in this case so which object specifies the name of the grain as well as the quantity, same thing for hops and other ingredients and on and on and on. And I can really expand,
06:41
I can really create those very complex objects that describe the information that I want to store. And another thing, so yeah, all the details there. And if you want to add a field to your relational database, sometimes it's, well, if you're managing your database, it's kind of easy, but when you have a DBA
07:03
and you've got some very complex environments, you've got a staging environment, development environment, your production environment, you've got to make sure they're all in sync and it can get really tricky to add, just to add another field. So it seems like a simple operation, but it can require a lot of work. And sometimes you'll need to ask permissions,
07:21
synchronize with different teams. And again, that can get a little bit tricky. When we're looking at the document model, one of the benefits is that you can just simply add that field to your document. So I've added it here. And if you compare it to your table, where I had to add a bunch of nulls
07:40
and then add property for those that I know property and what this null means, does it mean zero? Or does it mean, so in our traditional SQL database, that can be a little bit tricky. But in my document model, I've added it right there at the end of the document. That was very easy. And if I compare that to my other items
08:00
that I had in my database or in my collection, I haven't changed anything. So I've got access to this capability of adding different fields, different columns to all of my documents and all of my entries when I'm using the document model here. All right, so you might think, a document database that sort of saves JSON objects,
08:22
like big deal, could just use text fields, store those JSON objects inside of it. Sure, I mean, that could work, but it's a little bit more than just JSON. So it's actually using BSON behind the scene to store the information, make sure that it's stored in an efficient format, but most importantly, to make sure that you can query those fields.
08:41
So if you were just to store your JSON objects into a text field, searching is not ideal. You can't really store rich data types as well. So when we're looking at BSON, we have access to, well, those strings, those numbers, those Booleans, as I've shown, but you also have a date type. You also have multiple numeric types.
09:02
So you can really use all of those specific types by using extended JSON and ultimately get stored into BSON in our database. All right, so that's all nice and all, but when I started looking into those document databases,
09:20
it really wasn't obvious to me. I was trying to figure out like, okay, well, how do I store that data? It's still kind of hard for me to figure out, like, should I just embed everything into one single document? And it took me a while to figure that out. Like I said, I've been dealing with SQL databases since the 2000s. So it's kind of natural for me
09:41
to denormalize all of that data. And this is one of the things that someone told me that really made sense. And it really helped me to understand how should I model my data? And that is the one thing that you should remember from this doc is data access together should be stored together. And that is where you will really reap all of the power
10:03
of using the flexible schemas of a document database. All right, so before I jump into that, let's take a look at some of the mappings, just to make sure that we've got all of our terminology correct. So the first thing, when we're looking at a document database, such as MongoDB, we'll be talking about documents.
10:21
What are documents in a SQL or a MySQL or Oracle or MS SQL server database? Those are the equivalent of a row. All right, so each row in a table, those are equivalent of a document inside of our document database. Now, a document is equivalent to a row, but it could also be equivalent to multiple rows.
10:40
So I've kind of hinted at that in my introduction there. So I was able to get information from different tables and store that into one single document. So instead of using complex joins, now I can all have everything inside one single document. When we talk about fields, those are key value pairs, similar to your JSON object.
11:02
Those are the equivalent of a column. The columns can be optional inside different documents. So I've shown that in the intro again. So a column doesn't have to be exactly the same on each one of the document, but you can enforce that if you want or need to. So it's kind of a misconception
11:21
that you can just store whatever you want. You can actually enforce your schema and make sure that your data is always exactly the same from one document to another. Collection is the equivalent of a table. So it's exactly the same thing. It's just a series of documents that are stored into a database. And when we talk about tables,
11:41
of course, they're just a collection of rows that are stored together in a database. This one is easy. A database is the equivalent, exactly the same thing as a database in a more relational environment. Indexes, you can create indexes. Indexes are amazing, actually. I hadn't really dig into it,
12:02
but when you start dealing with a high amount of data and large, large collections, indexes will completely change the speed of your queries. It's day and night. So start using indexes if you don't already, and they work pretty much the same way as they would in a relational database.
12:22
Views, same thing. You can create views in both SQL or non-SQL environments, as well as, oh, when we talk about joins, that is where it gets very different. So there is multiple ways to do joins in MongoDB. A very easy one, or one that is used a lot of the time,
12:43
which is probably the direction you should go in, is to use embedded documents. So instead of creating a join in another table with all of that data, you just embed it into the same document. And again, if data needs to be accessed together, just store it together. Then you have access to all of your document. Modeling, then you can just extract those documents
13:01
from the database. So it's much faster than trying to join two tables, get that information together. It takes a lot of RAM to perform that operation. So by embedding everything into a single document makes it much faster. However, and that is the one thing that really, really was hard for me. I was trying to move to that denormalized mindset
13:22
into SQL, and then I was trying to put everything into a single documents, but you don't have to put everything into a single document. You can use database references to do something very similar to a join, or, and I'll actually show that one later on, you can do lookups. Lookup is exactly the same thing
13:41
as a left outer join in a traditional database. So you can do that. There's nothing that prevents you to have that same data model that you had earlier, and then just use lookups to join the different tables. However, that might not be the best way to do it, but a mix of embedded documents and lookups
14:01
is usually a nice way to do that. All right, finally, we've got asset transactions, which are definitely possible, both in MongoDB as well as in SQL. Again, if you're using them though, you might have to think about whether you actually need them. Now, because you can embed multiple things
14:21
into a single document, maybe you don't need those asset transactions anymore. So that is sometimes a smell that you should probably look at your data model and see if there's ways to embed that information into a single document so that you don't have to do a block in order to make sure
14:40
that you've got multiple operations happening as part of a transaction. A lot of the time, replacing asset transactions with just moving everything into a single model or a single document will really, really save you. It will help you save on time, increase your performance and on and on and on.
15:00
So if you need asset transactions, definitely look at if there's a way for you to model your data in order to avoid that. Well, that went fast. Okay, I just realized that I can use a shortcut on my keyboard. Okay, so let's talk a little bit about the benefits of NoSQL.
15:22
There's a couple of benefits of using a document model database. One is scalability. So when we're thinking about traditional relational databases, if you need to scale your application, if you need to put in more data, you'll need to increase the size of your server. And you can add more CPU and add more RAM
15:40
and add more hard drive. But you'll get to a point where you just can't add anymore. You will hit some physical limitations. There is so much RAM you can put into a machine and that can be an issue. In addition to that, every time you try to scale your machine, you might have to take down your server in order to upgrade that server. So that means downtime, which is less than ideal.
16:04
One of the big benefits of using document databases is that you can actually scale horizontally rather than vertically. What that means is that you can just add new servers. And whenever you need to increase the size of your database, rather than expanding and getting this bigger, bigger, bigger server,
16:21
you can just add another server and split the data across those two servers. So now the data can be gathered from those two sources and it's queryable exactly the same way as it would as if it was one server. So you can just keep on adding more and more and more servers without necessarily growing them. Now, it is definitely a big benefit
16:42
of using a document databases. However, it's not always the best way to scale. You can also scale vertically and sometimes that makes more sense. So you really have to take a look at how much it will cost and there is a little bit of an overhead when you're scaling or sharding your data in order to scale it horizontally.
17:00
One of the things that you need to keep into consideration is that when you're dealing with a MongoDB cluster, you always have three servers. So right here in this image, I only have a single server, but in reality, there are three servers creating a cluster here. So for redundancy and high availability, and if I scale, I'm adding another three servers and another three and another three.
17:21
So you're just adding more and more and more servers. So sometimes you need to take a look at the cost and see if actually increasing the size or the CPU and RAM of your servers, maybe it'll be cheaper. So it really depends on your use case. You'll have to do a little bit of math to make sure that scaling makes sense,
17:41
whether it's horizontally or vertically, but you have the option to do both, which can be useful. One of my favorite benefit is the agility of using a document database. If we look at the items that I had in my fake database, you can see that those two documents, although they do represent the same type of thing,
18:00
which is in this case, a beer with its recipe, they're kind of different. So this one has a couple of different fields that the first document doesn't have. And that's a very, very powerful feature. So you can really create those documents that can change from one to the other. Essentially, you could put anything into a single collection.
18:22
So in this case, I could put my beers as well as the authors. So, you know, I could add a field for myself, although that would be a very, very bad way to use your database. So yes, in theory, it is possible to put anything in there, but you'll still wanna make sure
18:41
that those documents are somewhat similar in nature so that you can actually query those documents so that you're able to create indexes that will make it faster for you to be able to query those fields. So you'll try to make sure that your documents are somewhat similar, but there's a lot of benefits of having that flexible schema still so that you can change and use the format
19:03
of the data that you want. So say I've got this application listing all of my beers that I have on tap, but now I want to pivot. It's COVID and I want to create bread now just like everybody else, right? So I could start adding those bread documents inside of the same collection.
19:22
And because they're somewhat similar, the recipe is pretty much the same thing. There's yeast, there's the same type of recipes, there's the instructions that I need there. So the documents are very, very similar. So I can actually use those two different documents inside of a single collection without changing anything on my database itself.
19:41
As I mentioned, there's also ways to enforce that schema if you need to, to make sure that all of your documents are always following that exact same structure that could also be beneficial depending on what you have there. And now one way to deal with those type of somewhat different documents,
20:01
they're similar but slightly different. Maybe you want to represent them slightly differently in your UI, you could use schema validation. So say, well, this is actually my schema one, this is my schema two, and use that in your UI to display things slightly differently. But I think even better than the agility
20:22
and it's kind of related to that, but it's the developer experience. For me, I just love working with those documents. I mean, if we look at a document, how it's stored and how you can extract it from MongoDB, this is what it looks like. So we've got all the information. So the beer ID, the name, and on and on and on. You've got an array with all of the grains right there.
20:42
And if we look at how we use it in JavaScript, well, it's exactly the same thing. As you can see, it's exactly the same object that I'm using in my UI to show and display the information about this item. And that is a real game changer. Now we're always dealing with that exact same document,
21:01
whether we're dealing in our database or in our UI. If you want to connect to, or if you want to look at a little bit of code on how to use those documents, so the first thing is always to connect to our database. So the first instruction and whether you're dealing with MongoDB or MySQL or whatever,
21:22
it's always pretty much the same thing. You define a client and then you'll connect to the database using a connection string. If you want to give it a try, MongoDB Atlas is the cloud offering for MongoDB. They have a free tier platform. Definitely give it a try. It's like 500 megabytes of space, which is more than enough for a lot of pet projects.
21:42
I actually have a couple of projects running on it. And you can see here that directly on the UI, I can get my connection string. I just copy that, paste this in my code, and I'm good to go. I can connect to my database. Now, if I want to read a document from my database, this is what the code looks like in JavaScript again. So I define my function, get beer,
22:01
I give it a beer ID, and then I get my result. So db.collection.beer.find one, I give it the criteria and it will return me a single document that would match that criteria. That's it. So it's relatively simple to use that code. It's that single line of code that is what does everything. And I'm just extracting that document from the database.
22:23
If you need more than one document, get all, you can use find with no specific criteria. So it'll just find all or return all of the documents that you have in your collection, or you could specify different criteria or different filters, such as is that equals true, for example, that will return you all the documents
22:40
that match this specific criteria. So you might think, well, that's exactly the same thing as a SQL database, right? We just do select star from beers where is that equals true, and we get all of the documents back that match that specific criteria. Now, that is kind of true, but there's a little bit more than that when we're looking at a traditional relational database.
23:03
So if I was to read all of my documents that match that specific, yes, I would do that select star where ID equals something, in this case, I'm just fetching one document. But in addition to that, you also need to, well, get all of the instructions for the recipe, and then you need to parse them
23:21
and add all of those to the object that you'll be returning to your UI. And then you also need to do the same thing for all of the grains and all of the hops and all of the other ingredients and on and on and on. So that object requires a bunch of different reads from your database, and then you've kind of have to process all of that,
23:41
create that object into your UI and then, or in your backend, and then send that to whatever program is gonna use that object. So if you compare the code, you can kind of see a difference here. So there's a lot more code involved because everything is stored into one single document that we can directly use into our UI.
24:00
Again, data access together should be stored together. There's just one call we do, and we can immediately start playing with that object. So that's really a game changer. Same type of thing goes for all of the operations. If you need to update a document, this is kind of the code. So first you need to delete all of the entries from the other tables that are matching there.
24:21
So all of the items from the recipe table, all the items from the grains table, and so on. And then you'll loop through your object and then add a new entry for each one of those grains, each one of those hubs and on and on and on. And then finally you do that huge update line, which is just a nightmare for me. I would just hate those
24:41
when I used to use traditional relational databases update where field equals X, field equals X, and on and on and on. And then you need to specify the array of values and you've got to make sure that they actually map at the right place. So if you add a new field in there, you need to make sure it's at the right place or else you get this big mess into your database
25:02
and that can be very dangerous. Also, you've got fields that sometimes are there, sometimes are not. So you've got to make sure that you're still adding an entry. So that can require a lot of work. Compare that to MongoDB again. Same type of operation update one, specify the criteria and pass it the object as a whole
25:22
and you're done. So that's all you need. Just pass it that object. So again, you get that very, very short bit of code. So you can really focus and jump right into your code and focus on what actually brings value to your application rather than having all of that overhead code that you need to write.
25:41
Just for more information, if you are interested in creating a document, it uses the insert one or insert many and to delete documents, same type of thing, delete one or delete many to create and delete documents. One of the really powerful features is the aggregation pipelines.
26:03
And this is kind of what it looks like. So this is the UI for MongoDB. You can go into aggregations and create those series of stages that will perform operations on a dataset to analyze and transform and do all kinds of advanced operations. So you can see here that I can use a match stage
26:22
that will enable me to filter data. And I can see the preview on the right here. So I was matching on is tapped. I could also match ABV greater than five and that'll return me the documents that match that criteria. But let's say I wanted to know how many beers that do I have on tap. So I can just use is tapped true.
26:40
I can see that there's two documents but I wanna return that programmatically. So I can add a second stage called count and then I can specify the name of the field. And then you can see that this is the document that is returned to me. So beers on tap two. And then I can go to the aggregation pipeline builder just get that code, copy the code that I need. And then finally I can use that in my application
27:03
just like this. So this will always return me a single document that has beers on tap with the value that I'm looking for. So those aggregation pipelines can get really, really complex, but they're very, very powerful tool to use all of that data. All right, time to un-SQLize an application.
27:24
So there's a couple of patterns that you can use. I won't go through each one of them. I'm gonna be using two of them, three technically. I'm using polymorphic because I'm using slightly different documents between all of the entries. I'll be using computed and extended reference as well. I'll get to those.
27:41
The patterns are very, very interesting ways and things to think about when you're migrating from a traditional database into a no SQL database. I have a link to an article that describes all of them at the end. So stay tuned, I'll share that link after we're done. All right, so this is my current database,
28:00
my current schema. So you can kind of see almost in the middle I've got a beer stable which has all of those properties. I added an author so we know who created that recipe. The author is in the brewers table has name profile pic, email, and then a bunch of socials. Not ideal because every time I wanna add a new social I kinda had to add a new column
28:22
but we'll take a look at that. I've got all my other tables such as other ingredients and instructions which are one to many relationships. And then I've got beer to beer grains to grains which is a many to many relationship. And so I select the grains from one single table
28:40
and then I map that to with the quantity to my recipe. Same type of thing for the hops there. So in order to migrate your data from one to the other an easy way is to just migrate your existing data. So there's nothing wrong with taking exactly your data. You'll need some sort of script to transform that
29:00
into a MongoDB friendly document but essentially you can keep the exact same structure. So you can see here that I've got my beers document and it contains the information exactly as I would in my relational database. And once you've got that you can actually use the dollar lookup to join tables.
29:21
So as I mentioned it is possible to do the equivalent of an inner join or left outer join. I'll mix those my joints there but you can do the equivalent of a join from in a document database. So this is the code to do a look up. So I'm doing a lookup from beer grains
29:41
and then I specify my foreign keys and I specify the field as which it should be imported. And now my result is something like this. So I've got all of the fields from my original document but I also, in addition to that have an array for all of my grains. Now you'll notice here that I have still grain ID one.
30:01
So I kind of need to do a second lookup inside of my lookup in order to get the name of the grain right here. So what you can do as part of a lookup is add another aggregation pipeline inside of your first stage. And then you can import the values from the other tables. So now I get something like at the right
30:21
where I just have the name and the quantity of the grains as part of my grain array. This can get very quickly out of hand though. So this is my actual lookup. You can see the first two columns are only the lookup for the grains part. And then I do the same thing for hops
30:40
and then I've got another lookup which is a little bit simpler for the instructions but this is all the lookups that I would need to do in order to get that single document. So one of the things that you should start doing is to just embrace a document model and start embedding those different fields. So we can see that every time that I need to extract that information I kind of need to fetch all of those fields
31:02
but again, data access together should be stored together. So this is my UI, this is what it looks like. So what's on top, I've got the different names. We can see the names, the ABV is listed there. I've got a view recipe link, so I would need the ID. So basically the only thing I need here are all in the same table.
31:22
I need the name, I need the ABV and I need the ID so I can link back to it. So in order to do that, I can just query my database very similar to what I would do with my SQL database. So just get from the collection, find everything matching is that equals true and then just return me the name and the ABV.
31:40
The ID is always returned unless explicitly told not to return it. So I'm not returning the whole object. I'm just returning just the essential data that I need. So this is my query to get only those fields. So that's the equivalent of doing select name ABV, ID from beers, where is that equals true, right?
32:03
So I can get those different documents and I can see them listed. So this is how it would look like in the UI. Now I've got my recipe, I click on view recipe and this is what I wanna see. So I've got the name of the beer at the top. I've got the author of that specific recipe. I've got that all of my grains that are listed there,
32:22
all of the hops and all of the other ingredients if needed. Then I also have a link to the author. So I'll need the author ID in there. So, well, we already had that huge lookup that we've experimented with. Now I don't wanna do that every single time. That would be kind of a waste of time and energy. Plus, I mean, it kind of slows down your query.
32:42
I mean, you have to do that lookup and extract all of that information from all the tables. So you're not getting any benefits over a relational database at this point. So one thing you can do is to perform that aggregation pipeline and then just insert all of those documents into a new document. So I'm just doing that aggregation pipeline
33:02
and then I fetch my new collection and just do an insert many with the result from the previous aggregation pipeline. And what I get now is that all of those documents look a little bit like this. So I've embedded all of those documents, all of those different fields are now embedded and you can see that in addition to name is tapped and ABV, I also have a grains array.
33:22
I have a hops array, I've got my recipe array and each one of them contains all of the different information. So you can see here that in my grain table, I've got the description and quantity kind of like I showed you earlier. I also have my hops array and you've got the names here.
33:40
Now, one of the thing I know, I know what you're thinking now. It's like, oh my God, data is duplicated. I've got those fields centennial, which is there two times, right? And then one of the thing when you start playing with document databases is start embracing duplication. I know it sounds really weird. We've been told in school never to do that.
34:00
However, storage is cheap. Storage is actually a lot cheaper than us software developers, right? So use that. There's nothing wrong with duplicating some of the data. So for some fields that actually makes sense and it can actually turn out to be very useful. So say they decide to change the brand name of a specific grain, it actually might be useful to still have the original name
34:23
as part of your older documents. So there's a lot of benefits. You can ensure that the data is always consistent by still having that grain stable to populate a dropdown, but then save the full value of it inside of your object. So you won't need to do that additional query
34:40
to that other table to get that value. So do you really need relationships? A lot of the time you don't really need them. You can actually just duplicate that data and it works perfectly well, even sometimes even better. So use that table to fill your select and that's the only usage for having that separate table. And then just save the full text
35:01
inside of your information. So now I've got all of that mapped into a single object. I still don't have the author as part of my document. So here's one thing that I can do because I don't want to embed my full author. I want to have a specific author page, right? And that author page will have all the links. And I kind of don't want to embed that
35:20
into every single one of my recipes because that wouldn't make a lot of sense. That's a big object that I don't want to put into every single one of my documents. So sometimes things like description of a grain makes sense to embed. Sometimes a full object doesn't make sense. However, what I can do in this specific case is use the extended reference pattern.
35:41
So I've got this field that doesn't change very often. I mean, the author is always going to be the same into my, for my recipe or for my beer object. But I can still link to my author ID. So I'll have that link. So if somebody wants to see the author profile, they'll be, I'll be able to map back to it
36:01
or link to it. Then I'll be able to show all of the information. But I always show the name of the author on the page. Again, data stored together should be accessed, data accessed together should be stored together. So I'll just keep that name and I'll add that as part of my author object.
36:20
So I will be duplicating that field, of course. I'll need to make sure that if I change the author, I also change the name here. But apart from that, it's not something that should change very often. So it doesn't add a lot of complexity, but it avoids one of those expensive lookups. Again, one of those extensive expensive joins. And that's what I wanna do. So it really reduces the number of reads.
36:42
It really optimizes my database again. So that's a pattern that can be very, very powerful and very useful. And then I've got my author page, which has, of course, this profile picture. And I see here, see all two recipes. So those are the two recipes that I authored. Now, that too, how would I get that value?
37:03
Again, you could do a second query to make sure that you can count all of the beers that have author equals one, but that's not ideal again. We're doing that second extra call. So what I'm doing here is that I'm using a computed pattern for counts. What that means is that every single time
37:20
that I add a new recipe where author equals one, I'll tell my database to use some sort of automated mechanism to increase the number of beer authored in my author document. And I wish I could point to my screen, but that's the last field on the right there. So you can see here that beer author equals two.
37:41
I have programmed into my database a trigger. So whenever I insert a new beer, I'll look at the author ID and just increase the number for the specific author. If I delete one of the documents, I'll just decrease or decrement that value as well. So this computed pattern means that basically you're doing the computation as you write the documents.
38:02
And it makes it much easier for future reads. So you don't need to do an extra read or to perform a, again, expensive count operation on a specific document or collection. You can just access that because it's all part of the same document. And finally, I wanted to show this very quickly.
38:22
And well, it doesn't seem to be working, but we've just released literally two days ago at the MongoDB relational migrator. Take a look at it. I'll actually link, I forgot to add a link to that, but I'll link to it in my page at the end. But basically it's a tool that you can use
38:41
to map all of your current database, current relational database, and it'll kind of create that same copy of the database, but in the document model. And then you'll be able to specify which field you want to add as an array
39:00
of a parent document and so on and so on. It will kind of create those documents for you. And once you've got all of that schema determined and you have it ready to go, you're actually able to just click on the migrate button. It'll take care of migrating all of those data into a MongoDB database. Very, very useful, a very powerful tool that you can use.
39:22
And you can even do, make sure that your clusters are in sync or just do a one-time migration if you need to. And I think that brings me to the end of my presentation. I've lost my slides now, but basically different things to keep in mind when you're dealing with NoSQL databases,
39:42
make sure that you embrace duplication. It's perfectly fine to duplicate some data we don't have to care about storage anymore. We're not in the 1960s, which is why relational databases were created originally. So make sure that you can use those different patterns that you can really exploit to make your data better.
40:03
And I'll leave you with this single link, which is easy URL to slash un-SQLize. It should show up on the screen anytime now. So make sure to use, take a look at this specific link, at this one link, you'll have all the information, all the blog posts that I've talked about will be available there.
40:20
And that's all I had. So thank you very much for being here. Thank you. I'm so terribly sorry for not being there in person again, looking forward to meet you all in person sometime soon. And I guess I do have maybe one minute for questions, maybe one question, if any. Okay, anybody has a question?
40:43
Yeah. Can you hear me? Yes, absolutely. Great, perfect. My question was about the horizontal scaling. So there are like relational SQL databases that are horizontally scalable, like Cockroach, for example.
41:02
So what specifically about the document model makes it easier to build something that is horizontally scalable? Yeah, I think it's more and more common. You're correct in some relational databases. It is not something that is widely supported across all of the different database,
41:22
the traditional databases yet. So it is something that is getting more and more traction. They're definitely working on it. It's just, but it's not necessarily due to the document model itself. It's mostly because it's built in. It was thought to be scalable that way from the start.
41:40
So that's why it is definitely easy to scale that data. So you can really, yeah. So that's kind of the gist of it. One more, you? No, oh, okay. So I think that's it, Joe. Thank you very much.
42:01
I'm sorry for you, but, yeah, I'm sorry for you, but we have free ice cream now downstairs in the sponsor area and also cookies and coffee. But next time, next time. Absolutely. Yeah, thank you very much. Thank you. Bye-bye.