
Solving the knapsack problem with recursive queries and PostgreSQL


Formal Metadata

Title
Solving the knapsack problem with recursive queries and PostgreSQL
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Optimization problems are everywhere, from deciding which clothes to pack in our luggage (aka the knapsack problem) to selecting the tasks to work on during a sprint. Trying to solve these types of problems by hand is a tedious task, often resulting in sub-optimal decisions. In this talk, we'll understand how PostgreSQL recursive queries can help. Starting from the proper problem definition, we'll then explore how to build queries that call themselves recursively, what the risks associated with this approach are, and which safeguards we can set to optimise performance. Finally, we'll demonstrate how two new features released in PostgreSQL 14 enable easier handling of recursive statements. If you're into PostgreSQL and eager to understand how recursion works, this session is for you!
Transcript: English (auto-generated)
Thank you very much. I'm really glad to be here in Berlin; it's my first time. As I've been announced, I'm Francesco Tisiot, part of Aiven, which is a company that, if you have an open source data problem and you don't find a solution, well, we can solve it for you.
Today we're going to talk about a problem that you probably didn't know you had, using a particular feature that you probably don't know exists, in something you possibly do know about: Postgres. So let's keep it simple. I want to say that there will be some controversy in this talk. I hope that by the end of it we will still be friends, because I probably have a present for you. So let's start now.
Let's start by trying to understand what the problem we are trying to solve actually is. What is the knapsack problem? The knapsack problem is a problem that everybody has when we need to pack for a trip. Why is that a problem, though? It's a problem because we have a knapsack and we have a lot of items.
And each of the items has at least a couple of qualities. Let's take the socks as an example. I don't like walking around with wet feet, so I give a lot of value to having an extra pair of socks in my bag, in my knapsack.
So in this case, a value of 10. At the same time, the socks are really tiny, they don't occupy a lot of space, so they have what I call a weight of 3. Let's move to the hat. I quite like my hat: a skyrocketing value of 15. At the same time, it occupies a little bit more space
than the socks: 5. Let's move to the trousers. I don't mind going out with trousers that have an extra scratch here and there. Also because, I don't know about yours, but the trousers I bought last time have an extreme weight of 15. Shoes are a mixed bag: a value of 8, a weight of 10.
And the same goes for one of my favorite t-shirts, which states that I love pizza: a value of 7, a weight of 10. So far, we have some sort of situation. Where is the problem? The problem is, and I'm sorry to tell you this:
we are not Mary Poppins. Our bag is not unlimited; our bag has a limit. Let's say that in this case the bag has a limit of 20. Now this becomes a problem, because if you do the math here, we will not be able to fit all our items in the bag. We will have to make a decision.
And how do you usually do this? Well, you take all your items, you put them on top of your bag, and then you start picking a subset of them. So let's try. Let's try with the trousers and the hat. Let's do a little bit of math, and we see that we have a total value of 20 and a total weight of 20.
So the knapsack is full now. Is 20 the maximum value that we can get? We don't really know until we check all the other possible combinations. So let's do that. Let's empty the bag and try another combination: socks, t-shirt, and hat.
Total value of 32, much more than 20. Total weight of 18. So the knapsack is not completely full. But if you remember all the items that we had, there was no item with a weight of 2 or less, so we cannot add anything else. Now the question is: is 32 the top choice?
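For reference, what the next few sentences describe, maximizing the value while keeping the weight under a limit, is the standard 0/1 knapsack formulation. The symbols below are just notation added here, not something shown in the talk:

```latex
\text{maximize } \sum_{i} v_i x_i
\quad \text{subject to} \quad \sum_{i} w_i x_i \le W, \qquad x_i \in \{0, 1\}
```

where $v_i$ and $w_i$ are the value and weight of item $i$, $W = 20$ is the bag limit from the example, and $x_i$ says whether item $i$ goes into the bag.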
Again, we don't know. We need to try all the possible combinations. In the end, what we are trying to do is maximize the value while at the same time keeping the weight under a certain limit. And if you've been traveling, this
is a sort of problem that we have always had. Companies give you a limit of 20 kilos on your luggage; if you go over 20 kilos, well, bad luck, you have to pay. But I have also been in a situation where a particular airline didn't allow me onto the plane because my luggage was over 20 kilos,
because they had limits on the total weight of luggage they could hold. Now, if you think that this is only a problem about luggage and traveling, well, not really. Because, as I said, this is a problem every time
we have to maximize a certain value while we have a limit on another value. And if we think about what we do in real life, we can apply the same kind of situation to a lot of other use cases. Let's think about which tasks we are
taking on for the next sprint. In this case, what we are trying to optimize is the benefit: if I close a certain number of tasks, what is the benefit of those tasks to the overall project? The limited resource is the amount of time that I have in the next two weeks. Or let's talk about stocks.
In this case, the limited resource is the money you're willing to invest, and what you are trying to optimize is the likelihood of those stocks gaining value in the next two weeks or two years, depending on your time frame. And going back to travel, we touched only a little bit of the travel problem,
because we were talking about a bag and adding items to it. But travel is an optimization problem that is way more complex, because you have a limited amount of money that you're willing to spend and a limited amount of time in which to visit the locations.
And you always want to optimize the overall value, your overall traveling experience. Now that I've made you understand that you may have the knapsack problem somewhere in your life, let's go back to the initial little problem. So you may say, Francesco,
I've been packing my bags for years. And you know, if you have only five items, that's an easy problem: I can solve it, I can try all the combinations, and within 10 minutes I solve the problem. And that's true. But you know, I live in a family with small kids, a wife, a huge family.
Usually our starting position is not like this, but more or less something like this. You have a huge number of items that you want to bring with you, and now trying to find all the possible combinations and the optimal value of those items
is not as easy as before. For each item that you add to your inventory, you are basically doubling the number of combinations that you have to check. But let's say, no, we don't have a lot of stuff, we can make a decision, our list of items is only five items.
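To put a number on that doubling claim: with $n$ items there are $2^n$ possible subsets to check, so every extra item doubles the search space.

```latex
2^5 = 32, \qquad 2^{10} = 1024, \qquad 2^{20} = 1\,048\,576
```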
But even here, if you live in the same physical world I live in, you will probably not have only one constraint. You will have at least two constraints, for example both the weight and the size of the bag. Even more, every item,
especially if you are traveling with someone else, will probably not have only one value, but multiple values. Let's say that I always give my socks a value of five, but my wife doesn't give a lot of value to my socks; she would prefer to pick something else. And the same goes for all the other items.
Now the problem becomes a lot more complex to solve, because there are multiple dimensions in which you need to take care of the limits and also multiple dimensions in which you can optimize the problem. What are you going to optimize for? The red hearts, the green ones, or a mix of the two?
All of these are possible different directions that you could take in order to solve the problem. But now you may tell me: Francesco, you are probably overthinking this. Items having different values for different people is not something real.
And now comes the worst part of my talk. I want to give you a real example of a case where this is a problem: socks and sandals, yes. I believe there is a different view on socks and sandals depending on whether you are in Germany or in Italy.
I'm not here to judge. I just wanted you to realize that this is a real problem. So I believe by now we have established that the knapsack problem is a problem everybody has, and also that we need some help from a computer in order to solve it at scale.
So the next part of the talk is: why should we do that in Postgres? It's a database. Why should we use Postgres to solve this problem? Maybe because we already have our inventory data in Postgres. But usually, when we face this kind of complex problem, we take the data out of the database
and we solve it somewhere else. Maybe that's a series of Python notebooks, maybe some new machine learning tool that our company just purchased. Usually we take all the beautiful data that we have in the Postgres database, do a CSV extract, and load it somewhere else.
The problem with this approach is that once you move the data out of the database, you lose all the good security bits that you put in place in the database to secure your data. So all the privileges saying which user can access a certain table, a certain column of the table,
or a certain subset of the rows, are gone. Even more, once you start extracting the data from the database, you are handing power to a certain user. And we have never had any problem with a user sharing a huge data set by mistake
in a public S3 bucket. That never happened, right? That's the problem: once you move the data away from the database, the data is wherever, and you lose control. Even more, if you move the data out of the database,
you lose the notion of what's current and what's stale. Because if you take a CSV and move it somewhere, you now have your beautiful data scientists working on your data, and they don't really know whether the data set was extracted five minutes ago or three months ago. You could have people running on stale data for a long time
and drawing the wrong conclusions. So my suggestion is: don't move the data out. We have a beautiful Postgres database with a beautiful engine that allows us to do a lot of queries. So let's now try to understand how we can solve the problem with Postgres.
And the initial solution could be: you know, I have my inventory table. I can do a select and pick the first item. Then I do another join with another select from the inventory table, and then another join with another select from the inventory table, and so on and so forth.
And this works. I mean, if you select one item, then another, then another, you end up with a correct solution. The problem is that in this case you are dictating how many items
are going to compose your solution. So if you have a huge, complex item set and you're trying to find the optimal solution, but you only allow six items in the bag, well, you could end up with a sub-optimal solution, because the perfect one could include seven items instead of six.
A solution like this doesn't work because we don't want to put any limit on the number of steps in our selection, only on the weight. If we need a variable number of iterations, a solution where we fix the number of iterations is not going to work.
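As a rough illustration of that fixed-number-of-joins approach (the inventory table and its columns are defined later in the talk, so the names here are assumptions), a three-item version would look something like this, and picking four items would need yet another join:

```sql
-- A sketch of the "one join per picked item" approach: the number of items
-- in the solution (three, here) is hard-coded into the shape of the query.
SELECT i1.name, i2.name, i3.name,
       i1.value + i2.value + i3.value    AS total_value,
       i1.weight + i2.weight + i3.weight AS total_weight
FROM inventory i1
JOIN inventory i2 ON i2.id > i1.id
JOIN inventory i3 ON i3.id > i2.id
WHERE i1.weight + i2.weight + i3.weight <= 20
ORDER BY total_value DESC;
```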
So we need to change our approach and start using recursive queries. What are recursive queries? Well, recursive queries are like this person holding the problem inside a frame. When you apply recursion in order to solve a problem,
what you do is check the frame, check the problem, cut the problem into pieces, and pass yourself a smaller version of the problem in the same frame. Then, at the next step, what do you do? Exactly the same: check the frame, check the problem, cut the problem into pieces, pass yourself
a smaller version of the problem. And on and on again, until the problem is so small, so easy to solve, that you can just do it. And then you pass the solution back up until you have solved the gigantic problem. This is recursion in any kind of language.
Now let's talk about how to solve this in Postgres. First of all, we need our table. So we had the initial set of items, and I wanted to provide you with all the code you need to recreate the example yourself. So I created a table called inventory: for each item I have the ID, the name, the value, and the weight, and then an insert statement to add the relevant rows.
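A minimal sketch of that setup; the exact column names and the trousers' value are not spelled out in the recording, so treat those as assumptions:

```sql
CREATE TABLE inventory (
    id     int PRIMARY KEY,
    name   text NOT NULL,
    value  int NOT NULL,
    weight int NOT NULL
);

-- Values and weights as quoted in the talk; the trousers' value is a placeholder.
INSERT INTO inventory (id, name, value, weight) VALUES
    (1, 'socks',    10,  3),
    (2, 'hat',      15,  5),
    (3, 'trousers',  5, 15),   -- value assumed, only the weight (15) is mentioned
    (4, 'shoes',     8, 10),
    (5, 't-shirt',   7, 10);
```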
Now let's talk about how to write recursive queries in Postgres. To me, writing a recursive query is more or less like jumping off a cliff. You have to state a couple of things.
First of all, you state where you stand initially. And no, I don't think I will do a live demo of this, because I could hurt myself. Then you have to say how you want to roll until you hit the water, or whatever is down there. I do have another person who could take my place, but let's not use them. So you set the base, and then you set
how you roll over, how you recurse on your data. Let's see the code now. The first thing we need is a WITH statement where we tell Postgres: look, I'm writing a recursive query. This already gives Postgres a hint. The query is called items,
and I'm saying this query will contain two rows, sorry, two columns: the picked items, which will be the list of the items that I pick along my journey, and the number of items, which will contain the count. Now let's state our starting position, our cliff. Our cliff is, what we are doing is,
we pick one item from our inventory table and we create an array containing only that first item. The second thing we do is set the number of items: the first selection is one, I'm selecting one item. Where am I selecting the item from?
From the inventory table. And in order to have a soft start, I'm starting by selecting only the socks. So socks will be my first choice. Now that I've set my starting position, let's do the rolling part, the recursive part.
So I do a UNION ALL, and there is an interesting thing: there is a select statement doing a select from inventory, which is our original table, cross join items. But we didn't define any items table. If you look closely here, what items is,
it's our recursive select statement itself. So this is where we tell Postgres: look, this result set is joined with itself. Now, to remove all the mystery, what are we doing every time we roll? Well, we are adding the item, the current choice,
the item we are choosing now, to the list of picked items, and we are adding 1 to the number of items. So we are basically saying: before we had socks, and now we add the hat, for example. And since we have set both the starting position
and the rolling bit, we can close the bracket and say select star from items, our recursive query. And let's say we want to retrieve all the possible combinations of three items, where the number of items is equal to 3. If you now take this code and run it in our Postgres database,
it will run. But it will run, run, run, and run, and it will never finish. Because there is a problem in here: we set our starting position, we set how to roll, and we set a filter outside, in our main query,
but we didn't tell Postgres where the edge is; we didn't say when to stop rolling. So let's keep it simple: inside the recursive part we add a WHERE clause and say, well, the total number of items must be less than or equal to 3.
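Putting the pieces together, the first version of the query looks roughly like this (a sketch, with the table and column names assumed above):

```sql
WITH RECURSIVE items (picked_items, num_items) AS (
    -- the cliff: the starting position, socks only
    SELECT ARRAY[name], 1
    FROM inventory
    WHERE name = 'socks'

    UNION ALL

    -- the rolling bit: add one more item at every step
    SELECT items.picked_items || inventory.name,
           items.num_items + 1
    FROM inventory
    CROSS JOIN items
    WHERE items.num_items < 3          -- without this, the recursion never stops
)
SELECT *
FROM items
WHERE num_items = 3;                   -- all the 3-item combinations starting with socks
```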
Now, if we take this query and run it against the Postgres database: beautiful, we get 25 rows. Those are all the possible combinations of three items starting with socks, which was our initial choice. Is everybody happy?
Did we solve the knapsack problem? No. Why didn't we solve the knapsack problem? Well, because we didn't add any of the knapsack constraints. What are those? Well, first of all, if we choose the socks, we cannot choose the socks again; we cannot choose the same item twice. Again, your definition of the problem might be different:
you may have five different pairs of socks; in my case, I have only one. The second constraint is not really a constraint, but we are trying to optimize the overall value, and we didn't even calculate the value of the solution. The third, and real, constraint: the weight cannot go over 20. That's the limit.
So let's rewrite our query to take care of the constraints. First of all, the WITH RECURSIVE bit: as you can tell, the font is getting smaller, so we have a little bit more code. It's still a recursive query called items. The first rows are more or less, sorry, the first columns are more or less the same as the ones I had before:
I have the picked items and the number of items. Now I'm also adding the item ID, which is the ID of the last item chosen in the series. And then I'm adding two more columns for the total weight and the total value. Now let's define again our cliff, our starting position. In this case, what we are going to do is
select the item ID, create an array with the item name, and set one as the number of items. Plus, we select the current item's value and weight as the total value and total weight: if you put one item in the bag, that's the total weight and total value of the bag itself.
Now the rolling bit. Every time we roll, we select a new item and we put its ID in the first column, we concatenate the item name to the list of picked items, and we add one to the number of items. Even more, we add to the total weight and the total value
the weight and the value of the currently selected item. And we always do our lovely cross join between inventory and items. Now let's add the constraints. We add a couple of WHERE clauses. The first one uses Postgres array functionality
to check that an array created with the name of the item we are picking now is not contained in the list of already picked items. This way we are not allowed to pick the socks twice. Then we say: calculate
the total weight of your previous selection plus the current item, and it must not go over 20; that was our constraint. We then close the bracket and do the same select statement, this time ordered by the total value.
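A sketch of the full constrained query as just described (same assumed table; the array check and the weight limit live in the recursive part):

```sql
WITH RECURSIVE items (item_id, picked_items, num_items, total_weight, total_value) AS (
    -- the cliff: a bag containing a single item
    SELECT id, ARRAY[name], 1, weight, value
    FROM inventory

    UNION ALL

    -- the rolling bit: try to add one more item
    SELECT inventory.id,
           items.picked_items || inventory.name,
           items.num_items + 1,
           items.total_weight + inventory.weight,
           items.total_value  + inventory.value
    FROM inventory
    CROSS JOIN items
    WHERE NOT ARRAY[inventory.name] <@ items.picked_items   -- never pick the same item twice
      AND items.total_weight + inventory.weight <= 20       -- the knapsack limit
)
SELECT *
FROM items
ORDER BY total_value;   -- worst combination first, best one last
```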
Beautiful: 33 rows, going from the worst possible selection, picking only the trousers, which I don't really like, to the optimal one: shoes, hat and socks, three items, a weight of 18, a value of 33. We solved the knapsack problem. Can you spot something that is still possibly wrong in here?
Exactly. The last six rows are exactly the same three items, just in a different order. If we look at this in detail, it's always socks, shoes and hat in a different order.
Now, this is specific to the problem you are solving, because in my case I don't care whether I pack the socks, then the shoes, then the hat, or the socks, then the hat, then the shoes. Probably you should care, because putting the shoes on top of the hat is not something I would like to do,
but let's say I don't care. In traveling, for example, if you have to go from Milan to New York to Rome, that's not optimized; if you go from Milan to Rome to New York, that's a much better solution. So depending on your problem,
the order of the events could be something you care about or something you don't. In my case, I don't care. So how can I get rid of all the permutations? How would you do that? How would you solve this problem if you had to pack your items?
Well, probably what you would do is check your wardrobe and order it. I believe I'm reasonably decent at managing my stuff: I would order all the items either by item name, by color, or by item ID if I attach labels; I don't know how compulsive
you want to be about the stuff in your wardrobe. But we can do much the same thing with our query: we can order all our items by item ID. So this is the recursive bit, and what I'm saying here is that the item ID
of the next choice must be greater than the item ID of the last choice I made in the previous step. So what we are doing is: if our wardrobe is always ordered, we force ourselves to always pick from the right side. And if we do that, well, you know,
we can also delete the first condition, because we are never going to reselect the same item. If we take this query again and run it, now we have the optimal solution: 15 rows, no duplicates, no more different orderings, and still the best solution is our socks,
hat, and shoes, with a weight of 18 and a total value of 33. And you know, we solved the knapsack problem with Postgres and recursive queries. Is everybody happy now? Are we still friends? Yes.
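For reference, a sketch of that deduplicated version: the only change is in the recursive WHERE clause, where the ordered-wardrobe trick replaces the array check (same assumed table and column names as before).

```sql
WITH RECURSIVE items (item_id, picked_items, num_items, total_weight, total_value) AS (
    SELECT id, ARRAY[name], 1, weight, value
    FROM inventory

    UNION ALL

    SELECT inventory.id,
           items.picked_items || inventory.name,
           items.num_items + 1,
           items.total_weight + inventory.weight,
           items.total_value  + inventory.value
    FROM inventory
    CROSS JOIN items
    WHERE inventory.id > items.item_id                   -- always pick "from the right"
      AND items.total_weight + inventory.weight <= 20    -- the knapsack limit
)
SELECT *
FROM items
ORDER BY total_value DESC;   -- 15 rows, best combination (socks, hat, shoes) on top
```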
The problem I had with this solution is that, you know, we are 30 minutes into the talk and I got 40 minutes allocated to me, and I cannot stop after 30 minutes. But something that I initially saw as a bad thing
can now work in my favor, because I wrote a blog post about this solution in September last year. And I believe I have enemies among the Postgres people doing the releases, because literally two weeks after I wrote the blog post they came out with Postgres 14, and they added two features that changed the way
we used to write recursive queries. Those two features are called SEARCH and CYCLE. So I believe it's a good idea to check what's new and understand those two features, because they could help you. Let's start with SEARCH. SEARCH, as the word says,
basically allows us to define how we browse the possible tree of options. If you check, this is the possible tree of options if we select socks as our first choice. So with SEARCH, we have a couple of options. The first option is to say: I want to look near before looking far.
I want to first evaluate all the options that take one step before evaluating all the options that take two steps. This is called breadth first. And basically, in our case, the first choice will of course be selecting the socks, but then we do near before far. So we will select socks and hat,
then socks and trousers, then socks and shoes, then socks and t-shirt. Once we have finished all the one-step-away solutions, we need to do two steps. So we do socks, hat and trousers, and then we scroll through all the rest of the options
until we finish them all. This is breadth first: again, searching near before searching far. How do I write the query? Well, the overall query doesn't change. All I need to add, at the end, is SEARCH BREADTH FIRST, and I define the ordering by item ID.
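A sketch of how that tail looks in Postgres 14, applied to the deduplicated query above (ordercol is just an assumed name for the generated ordering column):

```sql
WITH RECURSIVE items (item_id, picked_items, num_items, total_weight, total_value) AS (
    SELECT id, ARRAY[name], 1, weight, value
    FROM inventory
    UNION ALL
    SELECT inventory.id,
           items.picked_items || inventory.name,
           items.num_items + 1,
           items.total_weight + inventory.weight,
           items.total_value  + inventory.value
    FROM inventory
    CROSS JOIN items
    WHERE inventory.id > items.item_id
      AND items.total_weight + inventory.weight <= 20
) SEARCH BREADTH FIRST BY item_id SET ordercol
  -- for the depth-first browsing described next, swap the keyword:
  -- SEARCH DEPTH FIRST BY item_id SET ordercol
SELECT *
FROM items
ORDER BY ordercol;
```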
Then I can order my result by the order column that will be generated. If we take this query and run it against our data, we obtain something like this. As you can tell, I start looking near before looking far.
First I check only the socks, and then I add the second item: socks and hat, socks and trousers, socks and shoes, socks and t-shirt. And then two levels: socks, hat and trousers; socks, hat and shoes; socks, hat and t-shirt; and so on. What is the opposite?
The opposite is trying to go as far as possible before checking all the others. It's called depth first. In depth first, at the beginning we select only the socks. Then we try two steps: socks and hat. Then three steps: socks, hat and trousers. Four steps: socks, hat, trousers and shoes.
And then we finish; in this case, I limited it to three options, three choices. Then we go to the neighbor: we do socks, hat, trousers and t-shirt. Then we finish this branch too, we go to the nearest branch, and so on and so forth until we have evaluated all the possible combinations.
Again, if we run the query with depth first, we obtain a slightly different result, where you see I do socks, then add the hat, then add the trousers. Then I finish at level three: I check the neighbor with the shoes, I check the neighbor with the t-shirt, and I finish the subtree.
I go one level back to the trousers, and then away I go forward again. How does the query change? The only change you need to make compared to the previous one is DEPTH instead of BREADTH; there is nothing else. Now I want you to spend a couple of seconds
reviewing the output of the queries we saw. This was the output of the query using breadth first. Check the order column: this is what is automatically generated by Postgres. What we see here are two numbers.
The first number is the level, which is more or less what we calculated with the number of items, minus one. The second one is the item ID. If we now move to depth first, we see that the order column is now completely different:
here it is an array concatenating the item IDs, more or less what we had in the list of picked items, only with IDs instead of names. What I wanted to say here is that none of this was impossible before. It's just that now, with SEARCH,
we have a more elegant way of defining how we traverse the tree of possible solutions. And basically, that is all about SEARCH. Let's now move on to CYCLE. What is a cycle? Well, a cycle is when you close a loop.
And depending on the use case, a cycle could be interesting or not. Let's tackle a couple of use cases. The first one is about traveling. Let's say you are an American coming to Europe and you want to do a trip across Europe. Of course, you arrive in Rome, then you go to London, Paris, Oslo, Helsinki.
And then you wonder what the next stop is. What you don't want to do is go back to Rome, because there is no point in passing through the same location twice. So in this case, cycles are something you want to avoid, because they waste time. Let's check another example.
Everybody knows Bob. Bob gives some money to Maria, who gives the money to Karen, who gives the money to John, who gives the money to Luigi. Of course, when there is an Italian involved, Luigi gives the money back to Bob. This is the perfect money laundering scheme. So in this case, cycles, if you are the police,
are something you may want to check and look deep into. So depending on the use case, cycles are something to stay away from or something to look into. But you know, even in our knapsack problem we were handling cycles,
because for us a cycle was picking the socks twice. We were initially trying to avoid that by writing the array check, verifying that the item name was not contained in the picked items. Then we wrote the optimization saying: you know, I order my wardrobe and I always pick from the right.
Now with CYCLE, again, we have a more elegant way of saying this, which is: cycle using the item ID. The item ID is the unique identifier of a specific item. And please create a column, is_cycle, that will be true or false, telling us whether there is a cycle or not, using a support column that collects the item IDs.
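A sketch of the CYCLE clause applied to our query, on the same assumed table. Note that the hand-written checks are removed, so repeats are allowed again and Postgres flags them instead; is_cycle and path are simply the column names chosen here:

```sql
WITH RECURSIVE items (item_id, picked_items, num_items, total_weight, total_value) AS (
    SELECT id, ARRAY[name], 1, weight, value
    FROM inventory
    UNION ALL
    SELECT inventory.id,
           items.picked_items || inventory.name,
           items.num_items + 1,
           items.total_weight + inventory.weight,
           items.total_value  + inventory.value
    FROM inventory
    CROSS JOIN items
    WHERE items.total_weight + inventory.weight <= 20   -- only the weight limit remains
) CYCLE item_id SET is_cycle USING path   -- is_cycle: true/false; path: the visited item IDs
SELECT item_id, picked_items, is_cycle, path
FROM items;
```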
If we write the statement and run the query, what we get is something like this, where the item ID and the picked items come from the query as before. is_cycle is a true or false value telling us whether there is a cycle or not, and the support column, again, is the concatenation of item IDs, pretty much what we did with the picked items.
And we can immediately spot that socks, socks is a cycle. Socks, hat and socks; socks, hat and hat; socks, trousers and socks: they are all cycles. So again, what we were doing before was not far from what Postgres now does automatically.
But where before we had to write our own code, now we have a more elegant way of writing down how to detect whether something is a cycle or not. I want to conclude now, because time is almost over, by reviewing what we learned today. I'm sorry to say, we all share the same knapsack problem.
Even if it's not about traveling, it's about picking the issues in Jira, it's about deciding what to do tonight. All of this is an optimization problem that we may want to solve using Postgres. I'm not saying that every time you want to book a table for the night,
you should go to Postgres and insert all the data there, but you always have this option. Then we understood how to write recursive queries and why they are useful: because we are not forcing any limits that are not in the optimization problem itself. We are not forcing the number of steps
it takes to arrive at the solution; we let the optimal solution come to us. We understood how to write a recursive query with the cliff and the rolling bit, the base and the recursion. And then we looked at the Postgres 14 SEARCH and CYCLE options. I now want to leave you with some additional references
that you can find by scanning this QR code or with the URL at the top. You will see a page with four main links. The first one is the knapsack problem itself: it's the Wikipedia page describing the problem, and it's where I started my journey.
Well, actually, I didn't start there. This whole talk, the blog posts behind it, and all the rest came from a Stack Overflow question. So you can find, or at least I found, enormous value in going to Stack Overflow and checking what problems other people have. In this case, it was probably a student who needed some help to solve an assignment.
Still, we can see that a student's problem can be relevant to all of us. The second link is the blog post I wrote about this solution; but then Postgres 14 came out a couple of weeks later, and I wrote a new blog post showing, with a different example, a traveling example,
how to include the CYCLE and SEARCH features. The last one, and if we are still friends, this is where the surprise comes in, is aiven.io. It's the company I work for. As I said, we solve all your open source data problems. We offer a set of managed services, including Postgres, Kafka, Flink, and OpenSearch.
You basically just start up the instance and we take care of the rest. We offer an amazing $300, one-month free trial that you can use. If you're still my friends, come to our booth and we'll give you an extra $200 so you can enjoy your journey even better.
I hope this session was somewhat useful for you: we talked a bit about recursive queries, packing, socks and sandals. I'm here for any questions that you might have. Thank you very much.
So thank you, Francesco, for the interesting and entertaining presentation. Who wants to be first? Really wonderful talk, thank you. Have you experimented with constraining the search strategy at all, to sort of limit the search, prune the search space?
So, the beauty of recursion is that you don't set any artificial limit on the solution. The problem with recursion is that you don't set any artificial limit on the solution. This means that recursion can easily get out of hand. We are talking about a complex problem.
In the inventory example, if you check the code, we are doing a cross join with the table. This means that every time we join the table, we are matching every row with every other row. This can explode really quickly. So how do you optimize this? Again, if you want to achieve the best overall solution,
you shouldn't, because if you optimize, it means you are taking away possible combinations. Still, this is the same problem, on a larger scale, that you have with Google Maps when you want to find the shortest path.
What I believe they do, and I do believe so, is that they somehow take some of the options out when they don't make sense anymore. So if you have to go from Milan to Rome, there is a nice highway, and there are a lot of smaller streets around,
but at a certain point adding smaller streets doesn't make sense anymore, so you start deleting them. What you do with Postgres in this case is add WHERE clauses. Let's say, what you could do is: you have your whole item set, but the item set could be divided into categories. You could have the category t-shirt and the category trousers, and you could say,
I want to pick one item from each category. In that case, you are reducing the number of joins, or self-joins, and the number of possible combinations, because it's all nice and easy when you have five items, but when you have a million rows and you do a cross join with another million rows, that explodes really quickly.
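As a sketch of that idea (the category column is hypothetical and does not exist in the talk's demo table): tracking the categories already used and allowing at most one item per category prunes whole branches of the recursion before they explode.

```sql
-- Hypothetical pruning rule: assume inventory also has a "category" column.
WITH RECURSIVE items (item_id, picked_categories, total_weight, total_value) AS (
    SELECT id, ARRAY[category], weight, value
    FROM inventory

    UNION ALL

    SELECT inventory.id,
           items.picked_categories || inventory.category,
           items.total_weight + inventory.weight,
           items.total_value  + inventory.value
    FROM inventory
    CROSS JOIN items
    WHERE inventory.id > items.item_id                               -- keep the ordering trick
      AND NOT ARRAY[inventory.category] <@ items.picked_categories   -- at most one item per category
      AND items.total_weight + inventory.weight <= 20
)
SELECT *
FROM items
ORDER BY total_value DESC;
```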
So all the possible optimizations and solutions to this come from the design. There is no magic flag in Postgres that will solve your problems. You have to go back to your design team, to the people analyzing the problem, and say: okay, we cannot solve this in a decent amount of time,
we need to restrict the field somehow. So yes, I believe that depending on the size of the problem, you may find the overall optimum, or you may need to, at a certain point,
cut out options that stop making sense. Thank you, Francesco. Let's thank him together. Thank you very much. Thank you.