
How to conquer the world


Formal Metadata

Title
How to conquer the world
Title of Series
Part Number
98
Number of Parts
169
Author
Rogier van der Geer
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers
Publisher
Release Date
Language
English

Content Metadata

Subject Area
Genre
Abstract
Rogier van der Geer - How to conquer the world. The popular board game of Risk has many fans around the world. Using a Python-based simulation of the game, we use a genetic algorithm to train a Risk-playing algorithm. ----- During this talk we'll explain what genetic algorithms are, and we'll present an entertaining use case: how to win at popular board games. We'll demo how object-oriented patterns help with the design and implementation of these algorithms. We'll also demonstrate a library that allows users to push their own Risk bots into a game and battle it out.
Transcript: English (auto-generated)
Hi everyone. Our next speaker is Rogier, sorry, Rogier van der Geer. He is with GoDataDriven in Amsterdam, and if you have ever played Risk, he will give you some nice insights into how you can actually conquer the world with a simulation.
So please welcome. So indeed I will tell you how to conquer the world, at least in a tabletop version, and I will tell you how you can use genetic algorithms to actually improve your Risk skills.
So, in other words, we'll be doing some Risk analysis. An overview of this talk, first I'll just introduce the game of Risk to you, for the people that haven't played it before or just not recently. Of course, if you want to analyze the game of Risk in Python, then you need a Risk framework to do so, to actually play the game.
So I'll show you how I implemented Risk in Python, then I'll tell you what a genetic algorithm is actually, and then we'll see how you can use a genetic algorithm to play Risk. So let's start with the game of Risk.
A little history: it was invented in 1957, so it's over 50 years old, as The Conquest of the World, but since two years later, 1959, it's been called Risk. According to the publishers you can play it with anywhere between two and six players, but I'd say you need at least three, because for two the rules are different.
And the playing time is anywhere between one and eight hours, depending on how long you take to decide what you're going to do. I can tell you in Python this goes a little quicker. So, okay, everything revolves around this game board. What you see here is just a map of the world, divided into six continents, just like in reality, and 42 territories.
These territories are separated by borders; two territories are neighbors if they either share a border or are connected by a red line that crosses the sea. Now, when you prepare the game board, you first divide all the territories over the players that you start with. So imagine we have four players: we divide all 42 territories among those players, so we assign them to red, blue, yellow, and green. First we put one army on each territory; the numbers in the circles show that there's one army on every territory.
And then every player, one by one, gets to place one more army on one territory, wherever they want to put it, up until the point that they all have 30 armies on the board.
So that looks like this. After that, the game is ready to be played, and each turn consists of three stages: first there's the reinforcement stage, then the combat stage, and then comes the fortification. So what happens during these stages? First, during the reinforcement, a player places some additional armies on the game board. This really depends on his status in the game: how many territories does he control, and which ones. First of all, he gets one army for every three territories he controls, plus he gets some bonus armies for owning a full continent.
So for example, if you own the entire continent of Europe, you get five more armies. And then there's some game mechanics that I won't explain in detail, that is called reinforcement cards. You can earn these by claiming another territory, and you can get a little bit of additional bonus from that.
So after reinforcement, you can attack other players. So this is the combat stage. So what you can do is you can take some armies from one territory, and then attack a neighboring territory with these armies. The battle is decided by using dice, so by chance. And basically you only have a good chance to win if you have more armies.
Really it's only about having a lot of armies. If you conquer at least one territory during this stage, you get one of these reinforcement cards. And you can keep attacking during your turn, until you've had enough or you run out of armies to attack.
And then the last stage of the turn, which is a little less interesting, is fortification. This is where the player moves some armies: you can pick one territory and a neighboring territory, and move a few armies between those. You can only pick one combination of territories, they have to be neighbors, and then you can move as many armies as you have there.
So that summarizes the game. So, well not really, because then there's the winning of the game. So each player gets a mission, and by completing the mission, the player wins the game. Of course, if you just eliminate all other players, then you win.
But there's also the missions, like listed here, there's quite a few. I haven't listed all of them, but for example, there's missions like Conquer Africa and North America. So conquer two continents, destroy another player, or conquer at least a certain number of territories.
So, now, let's move on to Risk and Python. I've built a Risk framework in Python that consists of five classes, listed here. First of all, the board, which handles all the armies on the game board and how the territories connect to each other. Then there's the cards, which handles the reinforcement cards of a single player.
Then there's a mission, which describes the mission of a single player, and can also check whether this mission has been achieved. Then there are the players; these are the actual players that need to make the game decisions on what they're going to do to win. And in the end there's the game, which keeps track of all the other game mechanics, like the turns and all the stages in the turns. Okay, so let's have a look at the first object, the board. We can just import the board and create it. When we do this, we have to specify how many players we are going to use, because it immediately distributes the territories randomly amongst the players. Then we can call the plot board method, and it will actually show you which territory belongs to whom, and where the armies are. Of course, this initializes with only a single army in every territory.
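A minimal sketch of that setup; the module layout and the names (Board.create, plot_board) are assumptions reconstructed from the talk, not a published API:

```python
# Hypothetical reconstruction of the board setup described above.
from board import Board  # assumed module layout

# Create a board for four players; territories are assigned randomly.
board = Board.create(n_players=4)

# Draw the world map: territory ownership (colors) and army counts.
board.plot_board()
```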
The board allows for very easy manipulation of the armies on the board: you can easily change the owner of a territory. We can even give it to an owner that's not actually in the game. For example, here I set the owner of territory zero to player ID five, which is the black player, and then I place 15 armies on there. If we then plot the board, we can see that this is the case. This is an easy way to cheat.
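A sketch of that manipulation, again with assumed method names (set_owner, set_armies):

```python
# Give territory 0 to the black player (ID 5, not otherwise in the game)
# and stack 15 armies on it -- an easy way to cheat.
board.set_owner(territory_id=0, owner_id=5)
board.set_armies(territory_id=0, n_armies=15)
board.plot_board()  # the black stack of 15 is now visible on the map
```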
The board is also aware of the layout of the territories, so let's move back to this picture. Around territory zero, which has 15 black armies on it, there are five neighbors, of which three are owned by red and two by blue. So we can ask the board for the neighbors of territory zero, and it will tell you that there are indeed five neighbors, of which three are owned by player zero, that's red, and two by player one, that's blue, and all of them have one army on them. This makes it easy to decide, for example, which territories you're going to attack from where, because you can just ask the board what your options are.
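A sketch of such a neighbor query; the exact return structure is an assumption:

```python
# Ask the board who borders territory 0; each neighbor is assumed to
# carry its id, owner and army count.
for neighbor in board.neighbors(territory_id=0):
    print(neighbor.territory_id, neighbor.owner_id, neighbor.n_armies)
# Expected here: five neighbors, three owned by player 0 (red),
# two by player 1 (blue), all with a single army on them.
```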
Okay, so the board can also handle attack moves: you can just call the attack method from a certain territory to another territory. Of course, the target has to be a territory owned by someone else, and you attack with a certain number of armies; the board will throw the dice for you, and if you're lucky, like here, you can take over another territory.
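And a sketch of an attack move under the same assumptions:

```python
# Attack a neighboring hostile territory (say territory 1) from
# territory 0 with 14 of its 15 armies; the board throws the dice.
board.attack(from_territory=0, to_territory=1, n_armies=14)
board.plot_board()  # with 15 against 1, black almost surely moved in
```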
So like here, the black player moved into the Middle East; of course, that was easy with 15 armies against one. Then there are also the missions. We can get all the missions with the missions function; here too we have to specify the number of players, because the set of missions depends on it. Each mission has a description. There are quite a few that ask you to conquer a certain set of continents, and some ask you to conquer two specific continents plus an additional continent of your own choice. Others ask you to own a certain number of territories, sometimes even with two armies on each territory. And finally there are missions that ask you to eliminate another player; these depend on the number of players, so if we had five players, there would have been one more eliminate-another-player mission.
Okay, so we can just take a mission, and it will give you a representation that tells you the description of the mission and to whom it's assigned; by default it's not assigned. We can assign it to a player: for example, if we assign it to player ID 0, it will tell you it's assigned to red. Then we can ask whether the player has won the game already by having the mission evaluate the board, which in this case isn't true. Now there's one special case, and that is when a player is assigned to eliminate himself. For example, the last mission here is to eliminate the yellow player, which is so far unassigned; if we assign it to the yellow player, which is player 3, it will tell you there's a fallback mission, conquer at least 24 territories, which is now assigned to yellow.
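A sketch of the mission handling; missions, assign_to and evaluate are assumed names based on the description:

```python
from missions import missions  # assumed module layout

all_missions = missions(n_players=4)  # the set depends on the player count
mission = all_missions[0]             # e.g. "conquer Africa and North America"
mission.assign_to(player_id=0)        # now assigned to red
print(mission.evaluate(board))        # False: red hasn't achieved it yet
```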
Okay, so that all works. Then there are of course also the players. I'm not going to go into detail on how the player object works for now, because I made several versions; we'll encounter some of them later in the talk. But the players have to implement four methods, one for each of the game stages: the reinforcement, the attack, the fortification, and turning in the reinforcement cards.
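Such a player interface might look like this; the method names are assumptions based on those four stages:

```python
from abc import ABC, abstractmethod

class Player(ABC):
    """A Risk player must handle each of the four game stages."""

    @abstractmethod
    def reinforce(self, board):
        """Place the reinforcement armies on the board."""

    @abstractmethod
    def attack(self, board):
        """Decide which territories to attack, and from where."""

    @abstractmethod
    def fortify(self, board):
        """Move armies between two neighboring territories."""

    @abstractmethod
    def turn_in_cards(self, cards):
        """Decide when to turn in reinforcement cards."""
```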
And then, of course, there's the game object. Instead of just explaining what it does, let's just go through a full game. I've also implemented a random player object, which just plays the game randomly, so if you think one of the players makes a stupid decision, it's probably because it's just random.
So we can just import the game, we need to also provide it with a few players, so let's create a game with four random players, and then we can plot the game like we used to. You can also see that right now it plots some statistics that are probably not easy to read from there, but these are not so very important.
So the game board has been divided up amongst the four players, and now the first stage is to put armies on them. We can tell the game board to initialize a single army. This asks every player to place one army on the board.
And you can see that: for example, here there's a blue army, here there's an additional green army, there's a yellow army, and there's a red army. Now we can keep calling this method until all the players have 30 armies on the board, but of course that would be boring.
So we can just call initialize armies, and it does exactly that until we're done. Now you can see, if you're willing to count, that each player has exactly 30 armies on the board. So now we're ready for the first turn, the turn of the red player. The first thing the red player may do is place some armies on the board. We do that by calling the reinforce method and giving it the player object that needs to decide on what it is going to do. Then you can see, if you have a close look, that indeed there are a few more armies on the board. Let's go back to the previous situation for a moment: here there were three armies, and after the reinforcement there are four, so in total three armies have been placed on the game board. Then comes the attack phase. It works the same: you just call the attack method. Let's not go into all the details, but the red player actually attacked something, lost some armies, and in the end could not conquer the territory. Then there's the fortification phase, and a few armies move on the board. Okay, so now it's Blue's turn. Let's not go through Blue's turn all the way.
We can just call play turn, and it plays Blue's whole turn: Blue did the reinforcement, the attack, and then the fortification. We can keep doing this until the game has ended, or maybe just for 10 turns, and then you can see that Red is doing pretty well, Yellow has taken over South America, and Green almost took over Africa. So we can keep going until at some point the game ends; we can also ask the game whether it has ended, which just checks all the missions to see whether one of them has been completed. And then after a certain number of turns, in this case 54, one player wins, and that's the Red player. Of course the Red player is not playing a very good strategy right now, because he has all his armies in Australia; that's because he moves armies randomly, and the chance of moving them away from there is fairly low. But that's not the point: what I've demonstrated here is that we can actually play Risk in Python, along the lines of the sketch below.
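Putting the pieces together, a full random game might look like this; Game, RandomPlayer and the method names are assumptions reconstructed from the demo:

```python
from game import Game            # assumed module layout
from player import RandomPlayer  # a player that picks random legal moves

# Four random players battle it out.
game = Game(players=[RandomPlayer() for _ in range(4)])
game.initialize_armies()  # all players place armies until they have 30

n_turns = 0
while not game.has_ended():  # true once a player completes his mission
    game.play_turn()         # reinforce, attack and fortify for one player
    n_turns += 1

print(f"Finished after {n_turns} turns.")
```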
Okay, now let's move on to genetic algorithms. So what is a genetic algorithm? It's a machine learning algorithm based on evolution and natural selection; basically, how nature itself evolves. Why have I chosen a genetic algorithm? Well, it's easy to use with very little knowledge of the problem, and it's very robust against noise and many dimensions, and that will come in very, very handy.
Okay, now let's have a look at an example. Imagine we're trying to solve a puzzle whose solution is a string of 16 bits, for example this string here. Now imagine we don't know the answer to the puzzle; the only thing we can do is propose a possible solution, and there's a function that will tell us how many of the bits are correct. So if we tried this string, it would yield seven: the first four bits are all correct, the second block is all wrong, in the third block two are correct, and in the last block only one, for a total of seven. Now of course, this is a problem that's simple enough to just brute-force. But let's do it the genetic way: we can just start by generating a few random solutions.
So for example, take these four. We can evaluate all of them and get a score for each: the first and the second have a score of nine, and the third and the fourth score much worse. Let's only look at the best ones, because we're not interested in bad solutions, right? We take those top two solutions, call them parents (I've colored them here, a red one and a blue one), and we can split them up and paste them together to form children. If we take the first eight bits from the blue solution and the last eight bits from the red solution, we end up with a far worse solution than we had before: score nine drops to score six.
But if we do it the right way around, we take the first eight bits of the red one and the last eight bits of the blue one, we find a solution that has a score of 12. Now there's another way to improve a solution a little bit and that is by just randomly mutating a single bit.
So for example, we take the solution we just found, with a score of 12, and we can just randomly mutate bits: take the second bit and change it from a one to a zero, and we'll see the score drop to 11. But sometimes we're lucky and find a solution that's a little better than the one we had before. Now we can keep doing this: take two good solutions, combine them, hopefully find better solutions, then mutate them a little and combine them some more, until we have found a satisfactory solution. In this case, we know that the optimal solution will have a score of 16, but in many cases we don't actually know
how good the optimal solution will be. So, in short, a genetic algorithm requires you to have a random, or maybe not random, initial pool of solutions. You need some evaluation function. You need a combine method and a mutate method. And then you can just keep doing that and see your solutions improve.
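As a minimal sketch with those same ingredients (initial pool, evaluation function, combine, mutate), the toy bit-string example might look like this:

```python
import random

TARGET = [random.randint(0, 1) for _ in range(16)]  # the hidden solution

def score(candidate):
    """Evaluation function: how many bits match the hidden solution."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def combine(first, second):
    """Paste the first 8 bits of one parent onto the last 8 of the other."""
    return first[:8] + second[8:]

def mutate(candidate):
    """Randomly flip a single bit."""
    flipped = candidate[:]
    flipped[random.randrange(len(flipped))] ^= 1
    return flipped

# Start with a small pool of random solutions.
pool = [[random.randint(0, 1) for _ in range(16)] for _ in range(4)]
while max(score(c) for c in pool) < 16:
    pool.sort(key=score, reverse=True)
    parents = pool[:2]  # drop the bad solutions
    children = [mutate(combine(parents[0], parents[1])),
                mutate(combine(parents[1], parents[0]))]
    pool = parents + children

print(max(pool, key=score))
```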
So now let's think about how we would implement a genetic algorithm for Risk. Well, there's one problem. When we're going to evaluate a Risk player, we don't have a function that computes a score like before. The strings of bits we could easily evaluate and get a score of nine, six, or maybe 12, but if we have four Risk players, say player one through player four, which one is the best? There is no function that tells us how good a player is. We could of course have them play a game. So let's play a game between the four players, and player three wins. Is player three the best then? We don't know. There's an element of luck here, so it could be that actually some other player is better. So we can have them play six more games, and then we see that player one actually won the most games. Is player one the best then? Well, likely. We can play hundreds of games until we're satisfied with the precision we've achieved, but of course this takes time. And now what if we have eight players? We cannot have eight players play a single game, because the game only goes up to six players.
So of course we could have them play two games. First, player one through player four, they play a game, player one wins. Then player five through player eight play a game and player seven wins. But does this now mean that player one is better than player five? Well, it could be, but we don't know.
We don't really have a way to know unless the players from the first pool actually play against the players from the second pool. So the best solution would be to play games with all combinations of players, which in this case would be about 70. But remember, one game per combination is not enough, so we end up playing a couple of thousand games, and this is only for eight players. Now imagine we have 100 players: playing even a single game for every combination would already require millions and millions of games. This is not a solution, so we cannot do this.
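Those counts are just binomial coefficients; a quick check:

```python
from math import comb

print(comb(8, 4))    # 70 four-player combinations from 8 players
print(comb(100, 4))  # 3_921_225 combinations from 100 players
```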
So that's why I've decided to use TrueSkill. TrueSkill is a Bayesian ranking algorithm developed by Microsoft Research, which they use to rank players of Xbox games. It's very interesting; let me give you an example. Imagine we have two players, and they have never played a game. How good are they then? We don't know their skill. So we plot our belief of the skill of these players, with skill on the x-axis, where the right would be very good and the left would be very bad. We don't know how good the players are, so there's going to be some distribution. They're likely not extremely good, likely not extremely bad, but just somewhere in the middle. That's all we know, and the distribution is fairly broad. By the way, I took these plots from my colleague Vincent's blog.
If you want to play around with these distributions, go there. So: before, these players had never played a game, and we didn't know anything. But what if they do play a game against each other, and player one wins? Well, of course we're still not sure how good they are, but at least we think it's likely
that player one is a little better than player two. Now, if on the other hand, player two wins, then it's likely that player two is a little better than player one. So this is very nice. This is exactly what we expect.
Another very nice property of this algorithm: imagine that we have one very good player. The top player here, player one, is very good; we already know it's played many games and that it's very good, so our belief of the skill of this player is all the way on the right side of the plot.
Now imagine that player two is new, so we don't know anything about it; it's just a broad distribution somewhere in the middle, like before. Imagine they play a game against each other, and player one wins. Well, we expected that, right? Player one is a very good player, and player two most likely wasn't too good,
so nothing much changes. Player one just got a little better because he won another game, and player two, well, nothing much happened. But on the other hand, if the new player, player two, wins the game against player one, then suddenly we realize, hey, he must be very good. So you can see that the distribution of our belief
then goes all the way to the right. This means that in order to have an idea of the skill of a player, or of a Risk bot, you don't need all players to play against each other: you can just use TrueSkill to estimate your belief of each player's skill. You could implement this in Python yourself, but of course, like for most things in Python, there is already a package, so you can just install it and use it like this.
So by default, every player gets a rating with a mu of 25, and a sigma, which is the width of the distribution, of about eight. We can just make two ratings, A and B. That's very nice. And then after a game we can calculate new ratings: for example, we have A and B, each with a rating of 25, and we have them play a game in two groups, player A in one group and player B in the other, where the first group won, coming in first, and the second group lost, coming in second.
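With the trueskill package from PyPI, that game looks roughly like this:

```python
from trueskill import Rating, rate

a, b = Rating(), Rating()  # defaults: mu=25.0, sigma=25/3
# Player A's group came in first (rank 0), player B's came in second.
(a,), (b,) = rate([(a,), (b,)], ranks=[0, 1])
print(a.mu, b.mu)  # roughly 29.4 and 20.6
```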
Then we can see that the mu of player A increases to about 29, and the mu of player B decreases to about 20. If A then wins again, you can see that the score hardly increases anymore, because we already knew that A was better than B. On the other hand, if B wins after this, the scores go back to fairly close to 25 again, because, well, they each won a game. So this is very nice.
Another very nice thing about TrueSkill is that it can also handle larger groups and multiple players in a tie. So for example, for a Risk game where there's only one winner, we can say: player A won the game, and B, C, and D all lost the game.
Cool, so now we can use that to rank our players and evaluate them. Now all we need to do is implement a genetic risk player. As I said before, the player needs to be able to reinforce, attack, fortify, and turn in his cards.
So how would I do this, how do I pick a move? The easiest way is to create a list of all possible moves: for example, when reinforcing I have to place an army, so I make a list of all territories where I could place it. Then I rank all these moves based on some criteria, and pick the top-ranked move. So what kind of criteria would I use? For example, when placing a reinforcement, let's define a metric called the territory ratio: the number of hostile neighbors of a territory divided by the total number of neighbors of that territory.
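As a sketch, reusing the assumed board interface from earlier:

```python
def territory_ratio(board, territory_id, player_id):
    """Fraction of a territory's neighbors that are hostile:
    1.0 means deep in enemy lands, 0.0 means surrounded by friends."""
    neighbors = board.neighbors(territory_id=territory_id)
    hostile = sum(n.owner_id != player_id for n in neighbors)
    return hostile / len(neighbors)
```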
This gives me a measure of how far a territory is into enemy lands. And of course we can have a metric of how important a territory is for my mission: if I have to conquer Africa and the territory is in Africa, then I need it for my mission.
And I can define more of these, but let's take these two as an example. Imagine I own six territories and I need to place an army on one of them. I can calculate the territory ratio for all of them, and also figure out for each whether it is relevant for my mission. Each territory has an index, and territory 10 has a territory ratio of zero, which means it has no hostile neighbors; you probably don't want to put your reinforcement army there, right? Now, how do we turn these numbers into a rank? We can define a reinforcement rank: the territory ratio times some weight, plus the mission flag times some weight. Let's say the territory ratio weight is one and the mission weight is two; then we can calculate this rank like this.
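A sketch of that ranking step; the six territories and their values are made up for illustration, matching the examples from the slide (territory 10 at ratio zero, territory 18 on top):

```python
# (territory index, territory ratio, relevant for mission?)
territories = [(10, 0.0, 1), (12, 0.6, 0), (14, 0.5, 1),
               (16, 1.0, 0), (18, 0.8, 1), (20, 0.4, 0)]

RATIO_WEIGHT, MISSION_WEIGHT = 1, 2

def reinforcement_rank(territory):
    _, ratio, mission = territory
    return ratio * RATIO_WEIGHT + mission * MISSION_WEIGHT

# Place the army on the top-ranked territory.
best = max(territories, key=reinforcement_rank)
print(best[0])  # territory 18: 0.8 * 1 + 1 * 2 = 2.8
```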
So for example, for the first territory, territory 10: the territory ratio is zero, so times one that's still zero, plus one times two, that's two. We can do this for all territories, and then the top-ranked one is territory 18, so we place our army there. But now, how do we figure out which weights we should use for this algorithm? For this, we go the genetic way. We initialize lots of players with random weights, we play many, many games, we drop the bad players, we combine and mutate the good players to create new, maybe even better players, and repeat until we're done. So how does that look? For example, I take the territory ratio weight and draw it from a random distribution between minus 25 and plus 25. Why this interval?
Well, I don't know, I just picked something. Then we play lots of games, we drop all the bad players, and we create new players by combining and mutating these weights. And you can already see that, compared to the initial phase, after one iteration of the genetic algorithm the players with a higher territory ratio weight are definitely better than those with a lower one. One iteration is of course not much; after 10 iterations, almost all players with a negative territory ratio weight are gone, and after 40 iterations you see a nicely centered peak around 15. So why 15, what is this 15 normalized to? Well, I did a little trick there: one of the weights I fixed such that it could never be larger than one.
In fact, I allowed it to be either minus one, zero, or plus one. This one moved a little slower, but you can see that after 40 iterations it has a nice peak at one. So this means that when you're deciding where to put your reinforcement, the number of hostile armies near a territory is 15 times more important than whether or not the territory is relevant for your mission. I've implemented five factors for reinforcement. First there's whether or not I get continent bonus armies: the direct bonus applies if I already own the continent, and the indirect bonus if I don't own the full continent yet.
Then there's the mission, then there's the territory advantage, which is the territory ratio I just explained, and you can do the same with armies. I can show you more plots: the army advantage weight actually goes negative. You can see that at iteration 40 it has a nice negative peak. This means that you should never place reinforcements next to very large enemy armies close to your borders, which also makes sense. You can do the same for the direct bonus, or for all the other weights I've used for all the player functions.
So this is the result after running 40 iterations. Of course I could have run more, but it already took me 80 hours of compute time, so I decided to stop. If you want to be a very good Risk player, study this. I, by the way, do not claim to be a very good Risk player.
So are we done now? Well, no, definitely not. I mean, look at this: lots of these distributions have not converged to a single peak. Maybe they have; it doesn't really matter so much, but in the end we do expect this to converge at least a little more, so we could spend more computing time on this. Also, by using these weights we only allow linear combinations. Maybe it's much better to use the territory ratio squared plus the mission times something, I don't know. We could implement that, but it would create an even larger dimensionality and take even more time to run. And then, of course, these players do not look ahead, they don't plan, and they definitely do not adapt to opponents' plans. So these are definitely not the most advanced Risk players you can make, but I thought it would be a nice exercise. So, yeah, I'm early, sorry. In conclusion: by now we can play Risk in Python, I've made a genetic representation of a Risk player, and I've used a genetic algorithm to find better Risk players. I'm planning to open-source all the code I've used for this; I just need a little time to clean it up. Watch our blog: I'll publish the link to the repository there once I'm done. And until then, any questions?
Yes, questions. I have a question myself. Were you able to identify a superior strategy?
No. Please do come back next year. Thank you very much for everything. I would prefer if you would use the politically correct term of liberating the world.
And the other thing is, which precaution did you take to make sure not to develop SkyNet? I'm sorry? Which precaution did you take to make sure not to develop SkyNet?
Well, with SkyNet, you can conquer the world. That's good enough, right? That might be the best strategy. More questions? Thanks for the talk. I really like Risk. So, your best player, is he good?
I mean, have you tried to play against your best player? I have, but... well, the interface for playing the game yourself is a little cumbersome; you have to actually call the methods. So I played a single game, it took me an hour or so, and I lost. But of course a single game isn't really good statistics to tell whether or not these players are actually better than any human player. I guess that would take too much time to actually figure out.
What's the worst tactic that you've seen? Yeah, actually at the beginning I had real trouble running the first iteration of the genetic algorithm. It turns out that, depending on your weights, some of the players do not attack at all. And if you have four of these players in a single game, then you have no attacks at all in your game, which turns out to be a fairly poor strategy. Hi. How much trouble was it to visualize the data
with the map and the numbers? So, how much time did you have to spend on that? That was actually one of the easiest things. It's just, well, taking a figure and then placing these dots and numbers on them. Took me half an hour, I think. A little more of a general question, because when you're designing something like this,
you take the object-oriented approach. So what's the best advice you can give people starting out with a problem like this? How do you decide what to call which object, where to put what layer of abstraction, and how to keep it all organized when you're designing this for the first time? The best advice I can give you is to just think about it before you start implementing. I started with an implementation of the game which, in the end, didn't allow any player to interact with it. That's not so useful. Just think about it before you do it, and make logical choices about what you represent in an object and what you don't. I don't think there's a general rule for what the best solution is; just think about it before you do it. Okay, so thanks for the talk. I have a question about the, how is it called, TrueSkill metric. It seems to me that it's kind of similar to the Elo rating, but instead of a single number you use a distribution to express the player's relative skill. So is this it, or is there anything more to it? I'm actually not aware of the other method you named.
What was it exactly? Okay, I don't know it, so it could be the same. I know that Microsoft patented this one, so there must be something different at least, but it could be very similar. Well, Python 2 or Python 3.
No, I'm joking. Did you try, is it possible with this kind of algorithm, to try some Monte Carlo sampling or something? Because if you have to play thousands of games, I imagine you can take advantage of this kind of architecture. Are you asking me whether you can run this as Monte Carlo?
Yeah. Well, in the end this is sort of a Monte Carlo, in the sense that you're playing these games and you use random numbers as inputs: the initial distribution of the territories is random, and the combat outcomes are random too. So in a way this is Monte Carlo. On the other hand, it's of course not fully random, since these genetic players actually do make decisions on what to do. But I don't think you can take an easier approach
on this than doing it this way. Yes, we have four minutes left, so that's probably time for like three, four more questions. So one here, I saw one over here. You? Yeah, Evan. Have you considered a neural network approach for this?
No, I haven't. To be honest, I wanted to implement a genetic algorithm once, and I thought this would be a very good problem to use it on. Neural networks usually have other areas where they're very good. Maybe next time I'll try another board game with a neural network, who knows. So in the Risk game, there are different phases: first there's the initialization phase, and then there's the entire game.
So you could imagine splitting that up. What I'm trying to say is: have you tried to find the best initialization strategy by fixing the strategy the players take for the rest of the game, to see what the best kind of initialization is? And the other way around, fixing the way the players initialize, and then only optimizing the second half? I haven't looked at that. At the beginning, when I still only had the random player, I looked at the importance of the distribution of your territories at the start of the game, so not the initialization but the random distribution of the territories before that, and there I saw that this has a very large impact on the players, but of course those were random players. For the genetic players, I've just used a single reinforce method to place all the armies, both in the initialization phase and in the main game, so I really cannot say anything about that.
I used to play Risk a lot too, and I remember that making deals with other players is a very important part of winning the game. Did you take account of that also? No. These players are very dumb: they just look at all their possibilities and take the best one, whatever that is. There is no interaction between the players whatsoever. But of course that would be a good next step. For example, if you realize that a certain player, say the yellow player, has to eliminate the blue player, then you might want to defend the blue player, to prevent the yellow player from winning and in the end win yourself. All these things are not taken into account here; this is in the end just a fairly simple model. Okay, thank you very much. Probably we will see more next year, with improved strategies, alliances and everything.
Thanks a lot for your talk. Thanks for the questions. Thank you very much.