Shall We Play A Game?
Formal Metadata

Title: Shall We Play A Game?
Series: Ruby Conference 2015
Number of parts: 66
Author: Randy Coleman
License: CC Attribution 3.0 Unported: You may use and modify the work or its content for any legal purpose, and copy, distribute and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner they specify.
Identifiers: 10.5446/37556 (DOI)
Publication year: 2015
Language: English
Production place: San Antonio
Ruby Conference 2015, talk 19 of 66
Transcript: English (auto-generated)
00:16
Let's go ahead and get started. My name is Randy Coleman. I work at a company called Zeal,
00:21
and we're gonna talk about teaching computers to play games today. So, computer game playing has been a fascination for years and years and years, and actually a lot of the things we know today came out of teaching computers to play games, a lot of algorithms that we use, data structures, search techniques, et cetera. There can be games as simple as Tic-Tac-Toe,
00:41
which I'm sure we've all played. Teaching a computer to play Tic-Tac-Toe is not actually that hard anymore. Then you can get into some slightly more complex games that you might recognize, or maybe not, and then you can get into the really advanced stuff, which you've probably never, ever played before. If you don't recognize the screenshots, they're from the 1983 movie WarGames.
01:03
Classic, Matthew Broderick, Ally Sheedy, when they were like really, really young. Anyway, so games, like I said, have been an area of research for a long time, and you might recognize this more recent example. They taught Watson to play Jeopardy, and now they're using it for all kinds of other things.
01:21
So what I'm gonna talk about today is mostly really simple game stuff, pretty basic level, but it's at least a way to get started, and a lot of what game playing involves are graphs, trees, and algorithms. So why would we care about those? I mean, yeah, they're great for games, but what else? Well, we might have to represent hierarchical structures in our apps.
01:41
I actually worked on a project six or eight months ago where we had a hierarchical structure we had to represent and work with, and actually was able to suggest one of these algorithms to solve a problem that people had with that. Relationship maps, you know, you are only two connections away from this person on LinkedIn, that's a graph,
02:02
and there's algorithms for traversing those graphs. And no, I'm not gonna ask you if I can add you to my personal network, don't worry. Trip planning, trip planning is a graph problem. Network routing, how do you get your packets from here to there, that's another graph problem, and these algorithms, much more advanced versions
02:21
of these algorithms can be used in these kind of places. But those are kind of boring business problems. Let's talk about game playing instead. So for my example, I'm gonna use a board game called Ricochet Robots. I was introduced to this game a couple years ago, and as soon as I played it, I thought, I bet you I could write a program that would play this game for me. And so like any side project,
02:40
what you do is you have this idea, and you let it sit, and then you propose a talk on it, and then you have six weeks to write the game player and the talk, and get it all done. So that's my bonus lesson on how to side project. Don't do it that way. Anyway, so in Ricochet Robots, you have a 16 by 16 grid, like you see here.
03:00
The center four squares have a little gray thing on them that you can't go there. And on the grid, there's some walls, and then there's those colored areas are called goal targets. And so a game of Ricochet Robots, you have to move the robots around the board and achieve the different goals depending on which turn it is.
03:20
So the robots are distributed randomly about the board when you start. And then you turn over a little disk with one of the goal symbols on it, and that tells you what you have to do to play. So this game can be played by any number of people, which is really cool. Most games you have your own little token you have to move. And so everybody looks at the board and tries to figure out how many moves does it take
03:42
to get the green robot into the green square. You can move any of the robots, but robots can't stop once they start unless they hit something. So they can run into walls, or they can run into other robots, but they don't stop. And then they change direction and go a different direction. So in this case, to get the green robot into the green square,
04:04
I can come up with a seven move solution, which looks like this. You move the blue robot around, and then chase it around with the green robot, bounce off the blue one, and down into the goal. And that's basically how the game is played. And then you turn over another disk, and so there's 17 disks, and an entire game is made up
04:20
of going through all 17 disks. And you're just trying to be the person that can find the shortest solution. There's a little bit of a chance to find a better solution. You have about a minute to do that. There's a tiebreaker rule, so that if two people come up with the same length of solution, the person who's behind in the game gets to trump the other player,
04:41
which really helps to even the game out a lot. Even if you're not the fastest player, you still have a chance to stay in the game. So it's a lot of fun, it's pretty cool. But if we're gonna teach a computer to play this game, you have to kind of get a sense of the scope of the problem you're dealing with. So in this game, there are actually 976 and a half billion possible states the board can be in.
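That figure is easy to sanity-check: with five distinguishable robots placed on the 252 usable squares (16 × 16 minus the four blocked center squares), the number of ordered placements is 252 · 251 · 250 · 249 · 248. A quick Ruby check (names here are mine, for illustration):

```ruby
# Sanity check for the state count: five distinguishable robots
# placed on the 252 usable squares (16 * 16 minus the 4 center squares).
usable_squares = 16 * 16 - 4                 # => 252
robot_placements = (248..252).reduce(:*)     # 252 * 251 * 250 * 249 * 248
puts robot_placements                        # => 976484376000, about 976.5 billion
```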
05:01
So you've got the cells and the walls, there's five robots and 252 squares, and you do the combinatorics math for that, and you get 976 billion. That's a lot. And at each level, from each state of the board, there's between nine and 20 possible next states you can get to depending on which robot
05:20
you move in which direction. Sometimes robots are stuck in a corner, that's why there might be less moves, but nine to 20 moves. So that's a pretty huge amount of data, and you're not gonna really get through all that in a reasonable amount of time. So you need to come up with some ways around that, and we'll be talking about that today. So the first thing you need to do when you wanna write a computer game player is you need to figure out
05:40
how am I gonna represent the game? What are the data structures gonna look like? So we have to start by representing the board. And if we think about the game a little bit, the board never changes. The board itself never changes. You have the 16 by 16 grid with the center island, that's static. Then you have the walls and targets. There are different combinations of those,
06:00
but for one single game of 17 turns, that never changes either, that's static as well. The goal cell changes each turn, so a little more frequently than the walls, but less frequently than some other things. And then the robots, every time you move a robot, the robot positions change. So you got this combination of really static information
06:21
and really dynamic information, and if you have to keep track of a lot of states of the board at a time, you wanna minimize the amount of storage you need for that amount of data that you need to store. And so you wanna kinda split up what's static and what's dynamic so that you can reduce your storage. That's one of the things you'll find solving big problems is storage efficiency is often almost as important as actual time efficiency.
06:41
And a lot of times what you do is you trade off, oh, I can use up a little more storage, but then I can get faster. Or, gee, I'm running out of memory on my servers and I can't add more, so I need to maybe trade off a bit of speed so I can reduce my storage usage. And that's classic in a lot of performance problems. The next thing we have to represent is how do we figure out how to represent robot movement.
07:03
And what we can do is we can represent the board states and then the movement of robots as lines between them. And so here I've just got a few examples of moving the red robot a couple directions, moving the green robot a little bit. And so you can think of the board states as these boxes with the lines between them being which robot moved
07:22
to get to that new state. And this data structure, you may not recognize it or you may, is actually called the tree. It's a fairly classic computer science data structure. For some reason we always draw them with the root at the top instead of at the bottom like a real tree. Don't know, it's like starting from zero, right? So in a tree, the very topmost node is called the root
07:42
and it has no parent nodes above it. And every other node has exactly one parent and that's what defines a tree, exactly one parent. And then at the very bottom, the nodes that don't have any children below them are called leaf nodes. And when you're working with trees, you need to figure out how to search through the trees. And there are a bunch of different algorithms for that.
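As a minimal illustration of those definitions (the names are mine, not from the talk), a tree node only needs a value, exactly one parent, and a list of children:

```ruby
# A bare-bones tree node: exactly one parent (nil for the root),
# any number of children; nodes without children are leaves.
Node = Struct.new(:value, :parent, :children) do
  def root?
    parent.nil?
  end

  def leaf?
    children.empty?
  end
end

root  = Node.new(:start, nil, [])
child = Node.new(:move, root, [])
root.children << child

root.root?   # => true
child.leaf?  # => true
```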
08:03
Trees are really good for doing things like maze solvers as well, and Jamis Buck wrote a series of blog posts and turned them into a little book called Basil & Fabian: Three Ways Through. It's a super accessible introduction to these algorithms. It goes through a bunch of different algorithms for solving mazes. It's cast as a fable.
08:21
You've got a wizard and his apprentice and they have to get to a castle but they have to go through mazes to get to the castle. And so the wizard is teaching the apprentice how to solve the mazes to get through to the castle and a bunch of other adventures along the way. Great book. So if you're at all interested in this, I highly recommend that book. It's really cool. The first algorithm I'm gonna talk about is probably the simplest one to think about.
08:41
It's called depth first search. And a depth first search, what you do is you start at the root and you follow one branch all the way to the end and then you pop back up and go down again. So that looks like this. So you go all the way down the left then start moving across the tree until you get to the end.
09:00
Okay, so that's a depth first search. Typically depth first search is implemented using recursion and that might look something like this. You don't have to super understand the code but I wanted to show it for people who are more interested. So I have a solve method that generates the initial state or a path from the initial state and then solves that recursively
09:20
which builds up a list of candidates and then you go through all the candidates and find the shortest one, so the smallest by length. And if you don't find one, then just return that there's no solution. And the recursive solution, you look at the path and if it's solved, then you put that on your list of candidates. Otherwise, you find all the allowable successors
09:41
of that path. So every possible next state I could get to from where I am now and for each of those, make a recursive call to solve recursively and try again. And that's a recursive solution and that implements depth first search. Now you might notice that there could be some problems with this algorithm. For example, you can get into cycles on the board.
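A runnable sketch of that recursive depth first solver: `solved` and `successors` here are stand-in lambdas for the talk's actual path methods, and the toy search problem at the bottom is mine, added so the sketch executes.

```ruby
# Depth first search, recursively: follow each branch to the end,
# collect every solved path, then pick the shortest candidate.
def solve_recursively(path, candidates, solved:, successors:)
  if solved.call(path)
    candidates << path
  else
    successors.call(path).each do |next_path|
      solve_recursively(next_path, candidates, solved: solved, successors: successors)
    end
  end
  candidates
end

def solve(initial_path, solved:, successors:)
  candidates = solve_recursively(initial_path, [], solved: solved, successors: successors)
  candidates.min_by(&:length)            # shortest solution, or nil if none found
end

# Toy problem: a path is an array of states, the goal is reaching 3,
# and each state below 3 branches to a next step and a dead end.
solved     = ->(path) { path.last == 3 }
successors = ->(path) { path.last < 3 ? [path + [path.last + 1], path + [path.last + 10]] : [] }
solve([0], solved: solved, successors: successors)   # => [0, 1, 2, 3]
```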
10:01
So the green robot's up there on the upper right and it can get going around in circles. And if you have a naive depth first search algorithm, it's never gonna finish because that robot's just gonna keep going around in circles and you're never gonna stop.
10:20
So you have to build in some kind of cycle detection. So what I started with here for allow, this is computing the allowable successors. You haven't seen this code yet, but look at all the possible moves, generate a successor for those moves, get rid of any nils and successor just says, oh, give me the next robot, which is the robot moved in a certain direction
10:40
and generate a new path based on that. So you have to change that if you've got cycles to reject any cycles and a cycle is detected just by keeping track of all of the states you've been to before. Now notice the way I've done this here, the visited list is kind of new for each different path.
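One way to sketch that cycle rejection in Ruby (this is my approximation, not the speaker's code): if a path is simply the list of states it has passed through, then the path itself acts as the per-path visited list, and successors that revisit a state get rejected.

```ruby
# Per-path cycle rejection: the path (a list of states) doubles as its own
# visited list, so each branch of the search carries independent history.
# `next_states` stands in for generating every possible robot move.
def allowable_successors(path, next_states)
  next_states.call(path.last)
    .reject { |state| path.include?(state) }   # cycle detection
    .map    { |state| path + [state] }         # each survivor becomes a longer path
end

# Toy move table where :a and :b can bounce back and forth forever:
moves = { a: [:b], b: [:a, :c], c: [] }
next_states = ->(state) { moves[state] }
allowable_successors([:a, :b], next_states)    # => [[:a, :b, :c]]  (:a rejected as a cycle)
```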
11:04
And the reason for that is that you can get to the same path in two different ways. Now, if I explore the path on the left first, I'm gonna get to that place in four moves. And then later when I explore the other path, I get there in three moves, but I've already seen that state. So I'm gonna throw it out.
11:21
I visited that state. I don't need to explore it again, I think, except that I just found a shorter way there. So that's why each different path I explore has to have a different visited list because otherwise you don't find the optimal solution, the shortest path. So really what's going on here is this nice tree picture I showed you. It's not really a tree.
11:41
It's a graph. And so a graph is like a tree except that nodes in a graph can have multiple parents. Okay, so that's really the only difference. There's different kinds of graphs. I won't get into that too much. But this is really a graph. And so you can get to the same state from different directions. Now, this algorithm with the cycle detector
12:01
could be used to solve mazes pretty easily. Jamis' book shows that pretty well. But in this game, Ricochet Robots, there's some extra complications. For example, if your robot starts on the goal cell right away, that's not a legal solution. I mean, this is a zero-move solution. You're in the goal cell, right? Now, the rules say the robot has to ricochet
12:21
at least once, it has to change direction to the left or the right at least once. So that's not a legal solution. So you have to be able to get out of there and back into the goal. And again, that would look like a visited state, but it's not because it wasn't legal the first time. So you have to watch for that. There's a similar one where the robot starts one move away from the goal. And in this case,
12:40
the shortest solution actually takes you through the goal. So you go around here. Now, that totally looks like the cycle we just saw a few minutes ago. But because it involved an illegal solution, it's not actually a true cycle, and you have to watch out for that. So different game playing algorithms will have different complications based on the rules of the game you're playing. Like tic-tac-toe is very simple.
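A simplified sketch of that legality wrinkle (my simplification, and the speaker notes the real condition is trickier than this): if `moves` is the list of directions the active robot took, a legal solution needs at least one ricochet, i.e. one change of direction, so both the zero-move "already on the goal" case and a single straight slide fail.

```ruby
# The active robot must ricochet (change direction) at least once,
# so zero moves and a single straight move are both illegal finishes.
def legal_solution?(moves)
  moves.each_cons(2).any? { |a, b| a != b }
end

legal_solution?([])             # => false : started on the goal, no move made
legal_solution?([:up])          # => false : one straight move, no ricochet
legal_solution?([:up, :left])   # => true  : changed direction at least once
```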
13:01
Maze solving is relatively simple if you're watching for cycles. But this game is a little more complex because there are optimal solutions that are not legal solutions, and you have to watch for that. And I thought I had this tapped in an early version of my solver, and then was playing with some different algorithms and found out I was missing like known shortest solutions for some reason,
13:22
and it's because I did not have this condition right. I'm still not positive I've got it right. It's really tricky to get right. Anyway, so once you get back to that state, then you can get in the goal and you win the game. So overall, depth first search is a relatively simple algorithm, but for this game, it's not great because you have to search the whole tree so that you can figure out which solution was shortest.
13:41
You gotta go down to the bottom of all of the leaf nodes and then find which one is the shortest path, and with 976 and a half billion states, not gonna happen. So we need a better algorithm. So the next one I looked at is called Breadth First Search. And a Breadth First Search algorithm, instead of going down one branch of the tree,
14:00
you'll go across the levels of the tree, and that looks like this. So we look at the first level and then down one more and down one more till you get to the end. So that's a Breadth First Search, going across the tree, across the breadth of the tree, not down the depths of the tree. That looks like this in code.
14:20
You start with an array, and the array's actually serving as a data structure called a queue, which is like a line of people queuing up. The first person in the line is the first person out at the other end getting served. That's a queue. So that array serves as a queue, and then as long as there's paths in that queue, we'll take the first one off the list. That's what the shift does. If it's a solution, we'll return it.
14:42
Now, with Breadth First Search, the first solution we find is guaranteed to be the shortest solution because we've looked at all the one move and the two move and the three move solutions before we get to the four. So the first time you see a four, that's my solution, shortest path, good to go. And then if that wasn't a solution, then you take all the allowable successors of the path, shovel them onto the end of the queue,
15:02
and so you'll get to them when you get to them. And then if you get through that loop and there's no paths left to explore and you haven't found a solution, then there's no solution. So the really cool thing about Breadth First Search is now we can have a global visited list because if I come across a state I've already seen, I know that I came across it in at most the same number of moves.
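Putting those pieces together, here's a sketch of the breadth-first solver with a single global visited set: an Array used as a queue (shift off the front, push on the back), and the first solution found guaranteed shortest. `solved` and `successors` are stand-in lambdas, and the toy graph is mine, so the sketch runs on its own.

```ruby
require "set"

# Breadth-first search: explore all shorter paths before longer ones,
# so one global visited set is safe and the first hit is the shortest.
def bfs_solve(initial_path, solved:, successors:)
  queue   = [initial_path]
  visited = Set.new([initial_path.last])
  until queue.empty?
    path = queue.shift                           # dequeue: first in, first out
    return path if solved.call(path)
    successors.call(path).each do |next_path|
      next if visited.include?(next_path.last)   # seen in as few or fewer moves already
      visited << next_path.last
      queue << next_path                         # enqueue at the back
    end
  end
  nil                                            # queue exhausted: no solution
end

# Toy graph with two routes to :goal; BFS returns the shorter one.
edges = { start: [:a, :b], a: [:goal], b: [:c], c: [:goal], goal: [] }
successors = ->(path) { edges[path.last].map { |s| path + [s] } }
bfs_solve([:start], solved: ->(p) { p.last == :goal }, successors: successors)
# => [:start, :a, :goal]
```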
15:22
It was either less moves or the same number because I've searched all the shorter paths already. So that's pretty cool, and that looks like this. You just make a set up at the top with all the visited cells, and then you break out of, or you go into the next iteration of the loop. If you find one that you've already seen,
15:40
and then otherwise you shovel it onto the visited list. So that's cool because now you can use this global visited list. You don't have to worry about the case where there's a shorter path, which is pretty cool. But even with Breadth First Search, I mean it worked pretty well, but it was still too slow. And that's when you have to start trying to optimize your algorithms a little bit. So there's basically two things you can do,
16:00
and I've added a third here that kind of plays into the other two. So you can either do less things, or you can do things faster, and you can use heuristics to kind of choose which of those to do, and I'll talk about heuristics in a minute. So doing less things in this context really means, can we process less of that game graph? Less states, can we just process fewer states? Can we find ways to do that and get away with it?
16:23
Doing things faster would be doing less work per state, or doing that work faster somehow. And heuristics, really that's a fancy name for a rule of thumb. And the thing about heuristics or rules of thumb is that sometimes they work and sometimes they don't. You might find a rule of thumb that works almost all the time. If you were in Andre Arco's talk earlier,
16:42
he talked about Bundler, where they've got this really, really hard graph search problem to resolve dependencies, and they've come up with a bunch of heuristics that make it work really fast most of the time. There's still some degenerate cases you can get into, but they've got a bunch of heuristics in there for making things faster, and that's exactly what I'm talking about here. So there's times that heuristics will work great, and there might be times that they just don't.
17:03
Okay, so I'm gonna kind of take you through the optimizations I did in the order I did them in, because I think it tells a better story. So they're gonna be kind of jumping around between these three things a little bit. The first heuristic I tried was, from playing the game I realized the last move you make in any turn is to move the goal robot into its cell,
17:24
or I call it the active robot, the robot that you're trying to get into the goal cell. So that's gonna be the last move you make, and so at any level of the tree, if we look at, let's try moving that active robot first because if I'm like one step away from the goal, I'll find it first, and I don't have to look at all the other robots first. So I tried that heuristic.
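The mechanics of that heuristic are just `Array#rotate`: rotating the robot list by the goal robot's index brings it to the front, so its moves are generated and checked before the others'. (The robot names here are illustrative.)

```ruby
# "Active robot first" heuristic: rotate the list so the goal robot
# leads, and its moves get explored before any other robot's.
robots     = [:red, :blue, :green, :yellow, :silver]
goal_robot = :green

ordered = robots.rotate(robots.index(goal_robot))
ordered   # => [:green, :yellow, :silver, :red, :blue]
```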
17:43
So here I just, before I start calling the solver, I just find the index of the goal robot and use a rotate method to get that to the front of the array. Rotate's kind of a cool method. And this is what happened. All my timings, I had one sample game and I ran through all 17 turns, and the blue line with the circles on it
18:02
is the original algorithm, and the red line with the triangles is the new algorithm where the active robot moved first, and you can see that for the most part, the red line stays below the blue line. The places where it's exactly the same are places where the original algorithm already had the active robot first, so they're the same algorithm in that case.
18:20
There's that one outlier case where this heuristic just didn't work. You can see the big peak where the red is separated from the blue. That kind of skewed all the timings, but like I said, heuristics sometimes work and sometimes don't. I never have figured out why that one was so bad. Just in that case, it didn't work well. So the next thing I did,
18:40
I decided to keep that heuristic for the rest of the examples because it looked like it was slightly better in most cases. So the next thing I realized is that I was checking for solutions too late. I was checking when I'd start to process a path whether it was a solution. So this is the code we saw a little bit earlier. And so what that's doing is
19:01
I don't find the solution at node 16 there until I get there in the search. I've gone through all 16 nodes before I get there. But if instead I look for solutions as soon as I generate the successors, so like this, I get the allowable successors, I look to see if any of those are a solution, and if they are, I return it right away, and then I put the successors onto the queue.
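A sketch of that early check (mine, with the same stand-in lambdas as before, and with the visited-set bookkeeping left out for brevity): solutions are spotted the moment successors are generated, instead of when they're later dequeued.

```ruby
# Breadth-first search with an early solution check: scan each batch of
# successors as it's generated, returning immediately on a hit instead of
# queueing it and finishing the rest of the level first.
def bfs_solve_early(initial_path, solved:, successors:)
  return initial_path if solved.call(initial_path)
  queue = [initial_path]
  until queue.empty?
    path  = queue.shift
    succs = successors.call(path)
    found = succs.find { |p| solved.call(p) }   # check successors right away...
    return found if found                       # ...and skip exploring the rest
    queue.concat(succs)
  end
  nil
end

# Toy graph: the goal is a direct successor of the start.
edges = { start: [:a, :goal], a: [], goal: [] }
successors = ->(path) { edges[path.last].map { |s| path + [s] } }
bfs_solve_early([:start], solved: ->(p) { p.last == :goal }, successors: successors)
# => [:start, :goal]
```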
19:23
So what that means is that I would find the solution at node 16 when I was processing node six. So even in this really small tree right here, I was able to throw out 10 nodes that I didn't have to look at. That's pretty cool, and it turned out to be a really big savings in time. So this graph is the total number of states
19:41
I had to consider to play my sample game, and it was almost a factor of three fewer states to consider. So that was a huge win because you just don't have to explore all the rest of the level you're on and all the next level to the point where you find the solution. So that was a big win. Then I pulled out my profiler,
20:00
and if you saw Eileen's talk this morning, which I unfortunately didn't, but she talks all about how to profile and tune performance. I pulled out my profiler, and what I found was that I was spending way too much time trying to figure out where robots were gonna stop, because basically what I was doing is saying, let's try moving one. Can I move? Can I move? Can I move? Can I move? That's a lot of work,
20:21
and so what I did is at the beginning of the solver, I take the board and I compute that if there's nothing else, no other robots in the way, where would I stop from any given cell? Where would I stop going in any of the four directions? And so I've got, what I'm doing here is I'm trading off space for time. I'm taking up a little more storage space to store this graph or this chart of stopping cells,
20:43
but that's gonna allow me to have more speed in my algorithm, and I wasn't running out of memory, so I was okay making this trade off. So for example, the green cell right there, I would say, oh, if I'm gonna move up, I'm gonna stop at the top wall, and if I'm gonna move left, I'll stop at the other wall. I still have to look for robots being in the way, but at least this part of it is much faster,
21:01
and certainly it was. This is the number of states processed per second, and it was a factor of one and a half or something like that, so I could process that many more states in the same amount of time, so that was a pretty big speedup as well. The next thing I realized, I was looking at some other solvers online, and there was a talk by Michael Fogleman, and he made the observation
21:21
that the only robot you really care about is the goal robot, and the colors of the other robots don't matter at all. So in these two states here, the green robot is in the same position, all the other robots are in the same positions, but they're different colors in those positions, and I can treat these two states as equivalent
21:40
because the only robot I'm caring about is the green one. It doesn't matter if I bounce off the blue one or the yellow one. I don't care, so these two states are actually equivalent, and if I treat them that way, then I can save myself a little bit of work. Well, it turns out I was also able to process states faster because what I did is I computed for each board state
22:00
something called an equivalence class, some representation such that two boards that are equivalent by this rule would have the same equivalence class. So what I did is I computed a set of position hashes, one for each robot, and then for the active robot, I just added a thousand, so it came out as a different number,
22:21
so for example, I might have, like, I don't know, a 131 and a 242 and a 15 and a 1,001. It doesn't matter. I just wanted those two sets to be equal if they had the same numbers in them without caring about order. Anyway, so I computed this equivalence hash,
22:41
and that did allow me to save a little bit of work, a few fewer states. There's not a lot of savings there, but some. It also turns out that comparing those equivalence classes was faster than actually comparing the whole board state like I'd been doing before, so it sped things up as well,
23:02
so that was kind of cool, and then another little mini optimization, I discovered that if I take those equivalence classes and instead use an array and sort it, so I'm not sure if you can see the line there, but I've crossed out the set new part there and added a dot sort on the end, so it turns out that comparing sorted arrays is faster than comparing sets in Ruby,
23:22
which, if you think about it, makes sense, because with sorted arrays, you can compare the first element, second element, third element, and stop as soon as you find a difference. Sets are not ordered that way, so they compare differently, and they can still short-circuit, but this was actually faster. Normally sorting slows things down if you have to keep sorting all the time, but I sort once and cache the result
23:43
and do the comparisons a lot, so this was actually a decent savings, so bars are going up again a little bit. The next thing the profiler told me was that I was creating a lot of objects I didn't need to create, so in here, at the bottom there,
24:03
there's a method called with robot moved on board state that returns a new board state with that robot moved in a given direction, and then I can ask a robot to give me a copy of itself moved to a new location. So this is very functional-style code, because I needed to be making copies of all the states, because I needed to keep the old ones around
24:20
to consider them. And so what was happening is I was sometimes asking a robot to move in a direction when it was right up against a wall already, and I'd get back a copy of the robot in the same position it was already in, and I figured out I don't really need to be creating those new robot objects all the time. So what I do now is I figure out
24:43
what's the next cell the robot's gonna be in, and if it's the same cell it's already in, then just return self, otherwise make the new robot, and then I check for that in board state where if I get back the exact same robot object, then I can just early out. That was a pretty big win.
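That no-op guard can be sketched roughly like this; the class and method names are my own guesses at the shape of the code, not the solver's actual API:

```ruby
# Hypothetical sketch of the "return self on a no-op move" pattern.
# Because states are treated immutably, an unchanged move can safely
# share the existing object instead of allocating a copy.
class Robot
  attr_reader :cell

  def initialize(cell)
    @cell = cell
  end

  # Returns self (no allocation) when the move wouldn't change anything.
  def moved_to(new_cell)
    return self if new_cell == cell
    Robot.new(new_cell)
  end
end

robot = Robot.new([2, 3])
robot.moved_to([2, 3]).equal?(robot)  # => true  (same object, nothing allocated)
robot.moved_to([2, 7]).equal?(robot)  # => false (a genuinely new state)
```

Callers can then compare object identity to detect a blocked move cheaply, which is exactly the early-out described above.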
25:00
Creating fewer objects is always a good win, and then I also realized that there were some places where I could do identity checking instead of deeper equality checking. Identity checking is whether two references point to the very same object, versus checking whether two objects have the same values in their instance variables. And so right here I could use the equal question mark method,
25:21
and that was faster as well, and again that was a pretty big win. So let me sum up where we are so far. Those bars have dropped a lot. The solver got a lot faster. The longest turn in the sample game is down to just over a minute, I think.
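As a quick aside, the identity check mentioned a moment ago is Ruby's built-in `equal?`, which compares object identity rather than the value-based `==`:

```ruby
a = [1, 2, 3]
b = a
c = [1, 2, 3]

a == c       # => true   (same values)
a.equal?(c)  # => false  (two distinct objects)
a.equal?(b)  # => true   (one and the same object)
```

Identity comparison is a single object-ID check, so it's much cheaper than walking instance variables, which is why it paid off in the hot path.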
25:44
It plays the whole game in about three minutes and seven seconds, and it's a fairly complex game. I've had games where it can play the whole game in 30 seconds, which is insane. I don't wanna play against it anymore. So that's kind of where I'm at now with performance. There's still some longer solutions that it probably wouldn't find in a reasonable amount of time, and a human could probably still beat it,
26:01
so I wanted to make it faster, and I've been working on that without much luck, so I'm giving away the punchline. This is about as fast as I've gotten the thing so far, but I do wanna talk about another algorithm called best-first search. So again, here's our tree. Now in a best-first search, you don't necessarily go down a branch. You don't necessarily go across the levels of the tree.
26:21
You try to figure out which node of the tree is the next best choice to explore, and you're kind of working from a fringe of the ones you've already explored, but you decide, oh, what's the next best one? So that might come out looking something like this, where you're kind of bouncing around the different levels of the tree,
26:41
but you're not gonna always do the shortest solutions first. So that ends up looking like this. What you do instead of using a normal queue, like we were using for breadth-first search, we use something called a priority queue, and what a priority queue does is it sorts the elements in the queue based on some priority value so that the higher priority values come first,
27:01
and it orders them that way. So I found this gem called fast containers that has a pretty fast priority queue. It wraps a C implementation. There's actually a SortedSet in Ruby; don't use that, it's horrible performance-wise, especially if you need to iterate it, because to do each, it basically,
27:21
oh, I forget what it does, but it's very bad performance. This one was actually pretty good. So again, you find the successors and look for solutions. There's a couple of calls there to add path and add paths, and those look like this. Add paths just iterates all the successors and calls add path;
27:42
it pushes the path onto the queue along with its score, and I've got this queue set up to sort by lowest score first. The scoring method is what's interesting; that's basically how you decide what's the best next state to explore. That scoring method has all the magic here. This kind of looks like Smalltalk code,
28:00
and I was a Smalltalker. There's a saying about Smalltalk code that it always looks like the work happens somewhere else, and you can kind of see that here. The work's always happening somewhere else. Well, the score method is where the magic happens. So there's a bunch of different kinds of best-first search algorithms, but the best one is called A star. There's an earlier algorithm, Dijkstra's algorithm,
28:20
and A star is an evolution of that that's a little bit better. In A star, like I said, each state is scored, and the lowest score is best. The score is the distance traveled so far, how many moves we've made so far, plus an estimate of how many moves we have left to go. Now that's gonna be a guess
28:40
because we don't actually know. If we did, we'd already know how long the solution is, but we need some kind of estimating function. So that looks like this in code. The score method just computes path.length plus best estimate. Best estimate basically goes through all the active robots.
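Put together, the scoring just described, moves made so far plus the smallest precomputed estimate over the active robots, might be sketched like this (the names `path`, `active_cells`, and `estimates` are mine, not necessarily the solver's):

```ruby
# A*-style score: cost so far plus an admissible estimate of cost to go.
# `estimates` maps a cell to a precomputed lower bound on remaining moves.
def score(path, active_cells, estimates)
  path.length + active_cells.map { |cell| estimates[cell] }.min
end

estimates = { [0, 0] => 3, [2, 4] => 1 }
score([:up, :left], [[0, 0], [2, 4]], estimates)  # => 3 (2 moves so far + best estimate 1)
```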
29:01
There can be multiple active robots, because there's one target on the board that any robot can move to, so there can be more than one. That's why I need a map here: just find the estimate at each robot's cell, take the smallest of those estimates, and return that. Now, there are a couple of properties of the A star algorithm. The first is that you cannot overestimate in your scoring function.
29:21
So your estimate of how much distance you have left to go has to be lower than or equal to what the actual solution will be. If it overestimates, it doesn't work. And then there's another more convoluted property that if your estimation function is monotonic, which is a fancy math term I'll explain in a second,
29:43
then it's guaranteed that the first solution you find is the best solution. If it's not a monotonic scoring function, then you kind of have to keep going for a little bit longer; not too long, it's not horrible, but a monotonic estimating function is better. So what I mean by monotonic is that if I have an estimate of, say, four and I make one move,
30:04
then I can't be any lower than three. Like, I can't make a move to go from a four to a two. So that's basically what monotonic means. So now how do we figure out what's our estimating function? And again, I stole this idea from Michael Fogleman.
30:21
Thank you very much. What he suggests is you pretend that the robots can stop anywhere they want. They don't have to hit a wall first. How many moves would it take a robot to get to the goal cell if it could stop exactly where it wanted to stop? So if you're already on the goal cell, that's obviously zero.
30:40
You have no more moves left to make. Anything that's in a straight line from there is a one. And then anything that's in a straight line from the ones is a two. And anything in a straight line from the twos is a three, and fours, and fives. So from anywhere on this board, it's no more than five moves to get to the goal cell. Now my colleagues tell me
31:00
that basically I could write a Minesweeper player now that I have this picture, but I don't know. Anyway, so that's the estimating function. I'm not gonna show you the code for it. This is something that you can pre-compute as soon as you know what the target cell is. You can pre-compute all these numbers for the target cell and then just do a lookup. So that code I showed you where I was doing estimate at a cell,
31:21
it's just doing a lookup in a table like this. So like I said earlier, this algorithm should be better, but I have not been able to actually improve on the performance I had before. And the reason, I think, is that there are so many states that come up with the same estimate, because a lot of times you're not moving the active robot and so your actual score hasn't changed.
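That lookup table can be precomputed with a breadth-first pass outward from the target under the relaxed stop-anywhere rule. Here's a sketch that ignores walls entirely (the real table has to respect them), with all names my own:

```ruby
# For every cell on a wall-free width x height grid, compute a lower
# bound on moves to reach `target` if robots could stop anywhere.
# One "move" reaches any cell in the same row or column.
def estimate_table(width, height, target)
  dist = { target => 0 }
  frontier = [target]
  until frontier.empty?
    x, y = cell = frontier.shift
    neighbors = (0...width).map { |nx| [nx, y] } +
                (0...height).map { |ny| [x, ny] }
    neighbors.each do |n|
      next if dist.key?(n)
      dist[n] = dist[cell] + 1
      frontier << n
    end
  end
  dist
end

table = estimate_table(4, 4, [0, 0])
table[[0, 0]]  # => 0
table[[3, 0]]  # => 1 (shares a row with the target)
table[[3, 3]]  # => 2 (one move to a shared row or column, one more in)
```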
31:42
And so I need to mix this in with some other heuristics. And I've been playing with different ones like try moving robots that I've moved recently because often your solutions only involve maybe two or three robots instead of all five. So try moving the robots that I've already moved first, some mixture of moving the active robot first and recently used. I've gotten close to the same performance
32:01
as the breadth-first search, but not quite the same, which is sad, but that's how it's working so far. So I've got some ideas of where to go with this next. Like I said, when I'm looking for stopping points, I still have to figure out if there are any robots in the way, so maybe some kind of collision detection algorithm would work there, because the one I'm using right now is still pretty naive.
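For reference, the stopping-cell table from earlier (where a robot would stop from each cell in each direction, ignoring other robots) might be sketched like this on a board with only outer walls; interior walls would need extra bookkeeping, and the robot check described above still happens at solve time:

```ruby
# Precompute, for every cell, where a robot sliding in each direction
# would stop on an empty width x height board bounded by outer walls.
# Interior walls and other robots are deliberately ignored here.
def stop_table(width, height)
  stops = {}
  (0...width).each do |x|
    (0...height).each do |y|
      stops[[x, y]] = {
        left:  [0, y],
        right: [width - 1, y],
        up:    [x, 0],
        down:  [x, height - 1]
      }
    end
  end
  stops
end

stops = stop_table(16, 16)
stops[[5, 9]][:left]  # => [0, 9]   (slides to the left wall)
stops[[5, 9]][:down]  # => [5, 15]  (slides to the bottom wall)
```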
32:22
Maybe there's a way to work backwards. When I'm playing the game myself, I do this a lot. So figure out where the goal cell is and all those red highlights are places that the robot could stop before it moves into the goal cell. So if you work backwards from there, maybe there's a way to do that. I'm not quite sure how I'd implement that yet.
32:40
Again, I talked about this already. Try moving most recently moved robots first, some combination of active and most recently moved. Maybe there's a way to choose which direction to try next more intelligently, like maybe choose the direction that would move me closer to the same row or column as the goal cell. But a lot of times you have to kind of go up and around the board to get in,
33:00
so maybe that wouldn't work, I don't know. Maybe there's a way to pre-compute the per-robot stopping positions and then just update that as I go along, as the robots move. Not yet sure how well that would work; I haven't played with it yet. Obviously, using fewer objects and more primitive types might speed things up. If I find more places where I'm just creating
33:20
too many objects, that would speed things up. I've still been running the profiler and trying to optimize hotspots. I had one great example: I found something that looked like it should be a hotspot, added some caching, and went from three minutes to four and a half minutes solving my game. So not an optimization. You run into a lot of this kind of stuff. That's why the profiler is your friend, because almost always, when you're working with performance problems,
33:41
your intuitions about what is slow are wrong. And so the profiler basically keeps you from fooling yourself. This would be an obvious place for some parallelism. Maybe I should get a Parallella board and play with that and try it on there. That'd be kind of cool. I haven't played with parallelism that much here.
34:02
I could do other languages, but I mean, this is a Ruby conference, so why would I? Although I've toyed with the idea of porting this to Crystal and playing with that, which might be kind of fun. Anyway, that's where I'm at with the solver so far. I need to thank a few people. Trevor Yarish is a colleague at Zeal. He's one of our designers and he helped me with these slides. All the animations and the graphics were all his.
34:21
The parts where the slides are ugly are things that I did after he had his hands on it, so sorry, Trevor. The other people at Zeal, I had people pairing with me on this and giving me ideas and advice, so thanks to them. Michael Fogleman I mentioned a couple of times; I got some ideas out of his talk about this. Trevor Leilish-Mannah is the guy who introduced me to this game at Ruby DCamp
34:43
a couple years ago, and I love it, and then all the screen captures are for more games. I've actually got the code up on GitHub if you want to go look at it. I've got some time for questions. I will be posting the slides to Speaker Deck shortly. They'll be there, and I'm also on SpeakerRate. If you want to give me feedback, I'd appreciate it. I always like to get better at talking,
35:03
and I think that's about it. I will mention I work for Zeal, which is a web and mobile application development consultancy in southern Oregon, and we are hiring, and we also have just started an apprenticeship program. It's a paid apprenticeship program. If that interests you at all, come talk to me about it, and I have some Zeal stickers if you want one,
35:24
and also Stellar high five stickers. We started this site, and you can actually go there and give people a virtual high five on Twitter. Just if somebody's done something awesome that you like, go give them a Stellar high five, and thank them for their contributions. So I've got some time for questions.
35:40
Thank you very much. Okay, so the question is, what profiler did I use? I was just using ruby-prof, and that actually works pretty well, and it gives you HTML output, and you can click to jump around the different methods that are being called.
36:03
All right, so the question is, do I have any examples or ideas where some of the concepts here have affected day-to-day work? I mentioned early on there's a project I worked on where there was a hierarchical data structure that we were representing, basically like an organization chart, and we needed to be able to find, what was it, common ancestors or common descendants
36:25
from nodes in the tree, and so a breadth-first search worked great to find kind of the nearest common node. So we actually, the guys that were working on it were asking on our Slack channel about, well, how do we solve this? And I said, oh, you could try this algorithm, kind of walk them through it, and it worked great for them.
36:41
So that's one example I have. Right, so the comment is, places where you can save yourself object creation can have a huge impact. In their case, in Ruby 1.9.3, they were about 550 times faster by not creating new objects when they didn't have to. Any other questions?
37:02
All right, thank you very much, appreciate it. Thank you very much.