A Swarm of Processes — Simulating Ant Foraging Behavior with OTP - TIB AV-Portal

A Swarm of Processes — Simulating Ant Foraging Behavior with OTP

00:00

4

Ockelmann-Wagner, Will

Formal Metadata

Title

A Swarm of Processes — Simulating Ant Foraging Behavior with OTP

Title of Series

EMPEX LA Conference 2018

Number of Parts

11

Author

Ockelmann-Wagner, Will

License

CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this

Identifiers

10.5446/37886 (DOI)

Publisher

Release Date

Language

Producer

Production Year

2018

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

In this talk, we'll see a simulation of a foraging ant colony that can efficiently find and collect food, built using a separate OTP process for each ant. Along the way we'll look at GenServers, DynamicSupervisors, Registries, and more OTP goodness. We’ll also learn a little bit about ants! About Will: Will is a software developer at Carbon Five. He started his career in accounting, but found that automating his job away was more fun than the actual job, and moved over to software. He’s into typed functional programming and tiny keyboards.

EMPEX LA Conference 201811 / 11

Automatic playback

Speech

Text

Image

00:00

Process (computing)SimulationPresentation of a groupEnterprise architectureOffice suiteSoftware developerProduct (business)Client (computing)Computer animationXML

00:38

SimulationProcess (computing)Type theoryProcess (computing)Windows RegistryAerodynamicsType theoryServer (computing)Process (computing)Windows RegistryDynamical systemComputer animation

00:56

SimulationLocal GroupPlastikkarteMetropolitan area networkGroup actionComputer animation

01:22

SimulationAreaTrailDisk read-and-write headDrop (liquid)Strategy gameSet (mathematics)TrailCountingUniform resource locatorRandomizationComputer animation

01:51

SimulationCountingDistanceSurjective function

02:14

AreaTrailDisk read-and-write headDrop (liquid)Mechanism designDirection (geometry)Trail

02:49

SimulationAreaTrailDisk read-and-write headDrop (liquid)EvaporationTrailMultiplication signXMLComputer animation

03:23

SimulationAreaTrailAlgorithmAxiom of choiceVulnerability (computing)Computer animation

04:17

Strategy gameBitOnline helpStrategy gameCodeSoftware developerGoodness of fitComputer animation

04:38

SimulationCASE <Informatik>Heat transferTrailComputer animation

05:03

SimulationTopological vector spaceAlgorithmComputer programmingHypothesisKnapsack problemAnt colony optimization algorithmsMetropolitan area networkRoboticsSource codeComputer animationMeeting/Interview

05:49

SimulationMathematical optimizationAlgorithmComa BerenicesAnt colony optimization algorithmsGroup actionLogicDivisorNumberIterationBitClassical physicsState of matterInstance (computer science)Programming languageAlgorithmNP-hardGroup actionDefault (computer science)Functional programmingVirtualizationRandomizationMultiplication signOptimization problemComputer scienceGraph (mathematics)Power (physics)Travelling salesman problemSet (mathematics)Electronic program guideMathematical optimizationKnapsack problemLimit setMultiplicationTrailPlastikkarteComputer animation

07:47

Process (computing)SimulationServer (computing)Shared memoryType theoryConcurrency (computer science)Endliche ModelltheorieProcess (computing)Server (computing)CodeMechanism designUniform resource locatorDomain nameQuicksortFiber bundleContext awarenessPoint (geometry)LogicSimulationMessage passingArithmetic progressionHierarchyMultiplication signData typeCohesion (computer science)Projective planeComputer simulationComputer programmingTable (information)DatabaseCartesian coordinate systemState of matterWebsiteGame controllerLine (geometry)Functional programmingIntrusion detection systemModule (mathematics)Mobile appFlow separationMeeting/InterviewComputer animation

10:51

SimulationEllipseForcing (mathematics)Library (computing)Endliche ModelltheorieStandard deviationFunctional programmingFluid staticsType theoryBitBoolean algebraComputer animation

11:36

SimulationStrutFunctional programmingSimulationDomain nameType theoryXMLComputer animation

11:54

SimulationTesselationType theoryComputer animation

12:12

Module (mathematics)IntegerTesselationDeclarative programmingType theoryDescriptive statisticsDampingComputer animation

12:34

SimulationModule (mathematics)Data typeDifferent (Kate Ryan album)Key (cryptography)Point (geometry)Functional programmingProgramming languageType theoryPattern matchingPattern languageTesselationFunctional programmingQuicksortComputer animation

13:07

SimulationPattern languageType theoryPattern languageFlow separationoutputCASE <Informatik>Decision theoryTesselationFunctional programmingBit ratePoint (geometry)Instance (computer science)Computer animation

13:35

SimulationModule (mathematics)Type theoryTupleArithmetic meanFunctional programmingType theoryRoundness (object)MathematicsEndliche ModelltheoriePoint (geometry)IntegerModule (mathematics)Domain nameData structureComputer animation

14:13

SimulationImplementationProcess (computing)ImplementationMechanism designPhysical systemType theoryMereologyModule (mathematics)Intrusion detection systemSimulationNetwork topologyDifferent (Kate Ryan album)Computer simulationTrailCartesian coordinate systemDynamical systemError messageComputer animation

14:54

SimulationStrategy gameBitMultiplication signDifferent (Kate Ryan album)Type theoryProcess (computing)Electronic mailing listRun time (program lifecycle phase)Fluid staticsMultiplicationAddition

15:23

SimulationInstance (computer science)TupleProcess (computing)Principal ideal domainModule (mathematics)Dynamical systemComputer simulationQuicksortErlang distributionWindows RegistryMessage passingXML

15:52

SimulationTupleProcess (computing)Windows RegistryModule (mathematics)InformationData structureMessage passingDifferent (Kate Ryan album)Radio-frequency identificationXMLComputer animation

16:30

SimulationGame controllerSimulationInheritance (object-oriented programming)Execution unitMobile appProcess (computing)Windows RegistryTesselationFreewareComputer simulationComputer animation

16:56

SimulationState of matterTesselationServer (computing)GradientTupleSet (mathematics)Electronic mailing listComputer animationProgram flowchartXML

17:34

SimulationLoop (music)Point (geometry)Process (computing)Intrusion detection systemQuicksortMessage passingIdentifiabilitySinc functionComputer animationProgram flowchart

17:57

SimulationServer (computing)Semiconductor memoryTesselationElectronic mailing listSelectivity (electronic)Context awarenessDirection (geometry)LogicLevel (video gaming)AlgorithmServer (computing)DatabaseComputer animation

18:44

ClefGroup actionTesselation1 (number)SimulationProgramming languageCalculationMobile appQuicksortServer (computing)State of matterFocus (optics)Artistic renderingSinc functionComputer animationProgram flowchart

19:55

TupleData typeWindows RegistryProcess (computing)Context awarenessProcess (computing)NumberDynamical systemTupleWindows RegistryXMLComputer animation

20:19

Process (computing)Context awarenessWindows RegistryTupleData typeSimulationGoodness of fitCodeComputer simulationWeb 2.0LogicClient (computing)Context awarenessComputer animation

20:43

Windows RegistryTupleContext awarenessData typeProcess (computing)Type theoryData structureComputer programmingFitness functionProcess (computing)TesselationConcurrency (computer science)Network topologyProjective planeInheritance (object-oriented programming)CodeVector potentialState of matterSoftware repositoryFault-tolerant systemInstance (computer science)Social classServer (computing)Endliche ModelltheorieMultiplication signElectronic visual displayGame theorySoftware bugMulti-core processorCASE <Informatik>Codierung <Programmierung>BitClient (computing)Computer animation

22:49

Coma BerenicesXML

Transcript: English(auto-generated)

00:07

Hi, I'm Will. I'm a developer at Carbon 5 here in our Santa Monica office. As my colleague Andrew said earlier, we're an Agile product development company and work with clients ranging from new startups to large enterprises

00:24

who are looking to build a thing and get a little better at Agile while they're at it. We also often host the Elixir LA meetup in Santa Monica, so perhaps, hopefully, I'll see some of you there. So, today we're going to talk about ants, swarms of ants.

00:44

And because you paid to go to an Elixir conference, we'll talk about some Elixir stuff too, like processes and gen servers and registries and dynamic supervisors and contacts and types, but mostly ants. Now, a great man once said,

01:02

A person is smart. People are dumb, panicky, dangerous animals, and you know it. Luckily, we're not talking about people today. We're talking about ants. And they're just the opposite. An ant is dumb. Apologies to any ants in the room. A colony is a smart, efficient, coordinated group. So, how do they do it?

01:22

Well, the answer in short is pheromones. Now, ants are a diverse set of creatures, but one, here's a common strategy. Let's follow an ant named Alice. Alice, who, like all worker ants, is female, wanders the world in a semi-random way, searching for food. But wherever Alice goes, she is generally able to keep track of where she is.

01:42

Scientists think that some ants can keep track of landmarks, but that others actually count their steps and know their location by dead reckoning. We know this because some scientists glued tiny stilts to the legs of desert ants, gave them food, and let them go back to their colony. What they found was that the ants totally overshot the colony

02:02

because their step and distance counts were off. They also found they had been gluing tiny stilts onto ants, which is a pretty awesome way to do science. Anyway, so, once Alice comes across food, she picks up a piece and then she heads back home.

02:22

And since she knows where she is, she's able to take a direct path back. But when she's traveling with food, she leaves a pheromone trail behind her. For some ants, the mechanism is just, the food weighs them down enough that their stinger drags across the ground. But however they do it, other ants are able to smell this pheromone trail.

02:44

And so, once Alice gets back to the colony, she can drop the food off and then head back out. So now we can follow Bobby, another maybe less talented ant. She follows the same steps as Alice, but happens to find food that's twice as far out.

03:01

Still, she takes it back, also leaving a pheromone trail behind. But there's an important caveat. Pheromone trails evaporate over time. So, because Bobby's trips take twice as long as Alice's, Bobby's laying down half the pheromones at Alice's for a given hour of collecting. And because the pheromones evaporate over time,

03:21

Bobby's trail will never get that strong. This is important for our last ant, Eve. It turns out there's another step in the ant algorithm that Alice and Bobby are following. If, while wandering, a ant comes across a pheromone trail, they may decide to follow it to food, depending on how strong the trail is.

03:41

So now we can see how this comes together. If, in Eve's wandering, she comes across Bobby's weak trail, she may decide to pass it by. But when she sees Alice's strong trail, she's much more likely to get on it, follow it to food, and then bring it back, leaving her own pheromone trail on top. Pretty quickly, more and more ants make that same choice as the path is reinforced,

04:02

until eventually the whole colony has focused up on this, the closest source of food. Once the food's gone, the pheromone trail evaporates, and then all the ants can go wandering again, perhaps now finding Bobby's slightly weaker trail, and focusing on that. This is known as a recruitment strategy.

04:20

It's a way for members of a swarm to tell each other, hey, I found something good, help me out. So, that's more or less how at least some ants work. Which is interesting, maybe. But as a developer, what is maybe more interesting is that what Alice and friends were doing sounds a little bit like code. In fact, it's pretty straightforward to translate our ants' instructions into some pseudocode like this.

04:46

There are two main cases here. One where the ant doesn't have food, and one where she does. In the no-food case, if Alice sees food, she grabs it. If she sees a strong pheromone trail, she probably gets on it. And if she sees a weak one or nothing, then she carries on randomly walking.

05:04

If Alice is carrying food, on the other hand, then she just keeps going, depositing pheromones, and going towards home. When she gets home, she drops off the food and heads back out. So this is pretty algorithmic. As it turns out, we're not the first people to notice that ants seem sort of similar to computer programs.

05:22

Back in 1992, a man named Marco Dorgio, who, as you can see, is into both ants and robots, which I can relate to, came up with what he called the ant system for his PhD thesis. This was the start of what's now called ant colony optimization, or ACO,

05:41

which is a method of applying Alice's ant algorithm to tricky problems like the traveling salesman and knapsack problems. This is the general formula for ACO. There's a lot of Greek in here, but all it really means is, for a set of possible moves, the probability of picking one of them

06:00

is the amount of pheromone deposited on that move to the power of some influence factor, which is 2.0 in the literature by default, times the desirability of the move, which in traveling salesman might be related to distance or knapsack related to value, also taken to a factor, divided by the value of all the other possible moves.

06:24

So, for instance, in the traveling salesman problem, this is a classic hard problem in computer science. A traveling salesperson wants to visit a bunch of cities and do it in the least amount of time possible. So how do they choose which order? Well, it turns out there's no easy way to figure it out,

06:40

but ACO provides a pretty good approach. With ant colony optimization, you have a bunch of virtual ants randomly traverse a graph of cities, making sure to visit each one once. Afterwards, each ant drops a pheromone trail on its path, with the strength of the trail corresponding to how short the trip was.

07:01

Since the ants use the ACO algorithm to choose moves over multiple iterations, the ants coalesce onto the optimal solution. Pretty smart for some ants. So, ACO is an interesting technique, but if you want to do that kind of number crunching, Elixir is not your language. But a few months ago, I was reading about ACO, because that's how I roll on a Tuesday night.

07:23

And it got me thinking about something else. An ant has a little bit of state, it knows its location, and if it has food, it can take a limited set of actions, like moving, depositing pheromones, and grabbing and dropping off food. And the ACO algorithm that determines the ant's next state

07:41

is just a function of its current state, plus its surroundings and a little bit of randomness. That means ants are like actors. Not that kind. But they do fit pretty well into the actor model that Erlang and Elixir use for concurrency. Which got me wondering, can you simulate a forging ant colony using Elixir processes with Genserver?

08:03

So, the short answer is yes. It works, and we'll see that later on in the talk. But that's not really the point here. The point is that, if you're like me, you spend a lot of your time writing CRUD applications, even in Elixir, and don't actually explicitly drop into OTP that often.

08:20

When I'm trying to solve a problem, my instinct is to reach for a database table. So today I want to walk you through a sort of sufficiently complex example of how you might go about building a program where you're thinking in actors, supervisors, and data types instead. You might not be writing an ant simulation, because you won't be able to compete with this one,

08:41

but maybe you'll be inspired for your next problem. So, in Phoenix 1.3, as we heard earlier today, introduce the concept of context as a way to organize your code. As a reminder, instead of putting all your logic in a flat hierarchy of schemas and controllers, 1.3 encourages you

09:02

to pull your business logic out into separate, cohesive bundles of structs, functions, and modules, called contexts. So, following 1.3 best practices, we'll start with defining some contexts. Coming up with good context for your domain is definitely an art, and it's one that I personally find kind of challenging.

09:21

But these are the three that I came up with for this project. Ants, worlds, and simulations. Ants are definitely their own thing. The ants context will have logic for choosing a move based on locations it receives, and keeping state, stuff that ants know about. But outside of the ants sightline, there's a whole world.

09:42

The world knows about locations of food, even if no ants do, and can report to an ant what's immediately around it. That should definitely be separate from the ant code, and ants ought to only be able to query data from the world using its defined API. So, that's clearly another context.

10:00

And then finally, outside of the simulated world, there are the mechanisms of actually running a simulation. Spinning things up, shutting things down, assigning IDs, sending messages to cause the ants to do things. So, we can keep the things that we're actually simulating focused by taking the simulation itself out into its own context. I also ended up with sort of a shared context

10:22

that holds a grab bag of stuff that was used across the app, which is maybe a code smell, or is it a code pheromone? No, that's not good. So, now that we've got our base context figured out,

10:41

is to set up our data types. The structs and other things that define the domain. Then we can figure out how the data will be interpreted and transformed as the simulation progresses. So, a reasonable place to start might be to determine, to define our humble ant. Like we discussed earlier, the ant only really knows two facts.

11:02

Where they are in the world, and if they're carrying food. So, our ant model is equally simple. It knows its XY coordinates, and it has a food boolean. You'll notice I also defined a type for our ant. Types obviously aren't required in Elixir, but declaring types for structs lets the static type checker dialyzer

11:21

be a lot smarter about knowing, or using them, and is also a nice bit of documentation. In the Elixir standard library, as well as type functional languages like OCaml, it's idiomatic to name the main type in a module T. That way, you can say that a function takes an ant.T,

11:41

and it's clear what you're talking about. So, the few other types in this domain will focus on the world of the simulation, and how an ant will interact with it. So, what does that world look like? Well, for this simulation, the world is pretty simple. It's a grid of tiles, surrounded by impassable rocks.

12:02

The world is mostly empty land, but it has a few brown food tiles for ants to find, and a pink colony for them to start at and return the food to. So, given this, how would we represent it in types? Well, they're pretty much just the English descriptions. Land tiles can have pheromones on them, food tiles can have food to be collected,

12:23

the home tile has food that's been collected, and rocks don't have any particular data associated with them. For brevity, I've skipped the type declarations here, but pheromones float, and food is an integer. So, given those types, we can define a tile.T type which can be any one of the four sorts of tile.

12:43

This is a key concept in type functional programming, and it's called a discriminated union, or a tagged union, disjoint union, variant, or some type, depending on the language. But whatever you call it, the point is that a tile.T could be any sort of tile, but we can use pattern matching to figure out which one it is.

13:00

Being able to pattern match out the type is what makes a tagged union different than a simple union type. So, for instance, because we know that we have a tile.T in this rating function, we can make some decision based on what type it is. Obviously, this kind of pattern matching on a thing is common in Elixir,

13:21

but it's nice to give the technique a name, or several in this case. And because I've declared the possible types of a tile, it's easy to verify that the rating function is a total function, in that it handles all possible inputs. A couple other types, points and moves, round out the main-domain model.

13:42

Both of these are defined as pairs of integers, but points are x, y coordinates on the grid, whereas moves are changes to x and y that go from negative one to one. So even though they're internally represented in the same way, they have different modules, so we have a clear place to put functions that operate on one or the other.

14:01

We have types that say what an x or a y means in a particular function, and if someday we want to change one of the data structures, we can do all that work in one place. So, now that we've got some types, we can sketch out how the system will work. For the most part, I'm not going to dive into the actual implementation of the modules,

14:24

but you might be interested in seeing some of the mechanics of managing a bunch of processes. So, here's the supervision tree for this application. I want to be able to run multiple simulations at once, so the two top-level processes are a SimID agent that keeps track of the IDs for the different running simulations,

14:43

and then a simulations supervisor, which is a dynamic supervisor that can spin up many child simulation supervisors and restart them when they have errors. A quick note on dynamic supervisors. They've been mentioned a couple of times today. Originally, they're a new addition in Elixir 1.6

15:02

and replaced the old simple one-for-one supervisors. Both kinds let you create many child processes at runtime rather than supervising a static list of processes. The only difference is that the new dynamic supervisors have a little bit nicer syntax, and they can supervise multiple different types of children.

15:23

But whatever sort of dynamic supervisor you use, there's always a problem. There's naming. With an Erlang process, you need to know its name or PID to send it messages. That's easy for a process that you only have one instance of, since you can give it a global name, like often the name of the module it's defined in,

15:42

like with the simulations supervisor. But for a process that you're going to have a bunch of, like an ANT, that doesn't work. Instead, we'll need a registry and a via tuple. So, you can do a lot of different things with registries, but the one that matters here is naming processes. Using the registry, you can construct a via tuple,

16:03

which is a data structure that can be used to uniquely identify a process. That way, even if a process is dynamically started and then later restarted with a new PID, you can still pass messages to it. A via tuple has this structure. It's a three tuple of the Adam via, the registry module,

16:21

which is a thing in Elixir, and then some tuple that starts with the name of a module you set up as a registry, and then some uniquely identifying information. So with that, we can start simulation supervisors. Each one of these controls the supervisors for a given simulation, super well named, so they can be taken down as one unit.

16:45

Simulation supervisors can be referred to by their via tuple, which includes the SIM registry, that's the registry for this app, the simulation's ID, and then an Adam saying, hey, this is a simulation process. For each simulation, we have a tile supervisor that supervises a bunch of tiles, 400 for a 20 by 20 grid.

17:05

Each tile is a gen server that holds the tile.T data in state and can be told to add pheromones or remove food from its state. The via tuple for tiles includes the XY coordinates of the tile. That's important because it means that for an ant

17:21

at a certain set of coordinates, it can easily look up all the tiles around it. For an ant at 11, all the tiles from 00 to 22. That kind of look up would be really inefficient if the tiles were kept in a big list. So we also have a process for each ant, since that was sort of the point of the exercise.

17:41

But since there's nothing identifying about an ant, the coordinates change, there's no XY that you can send a message to. It needs a little ant ID agent to give a unique identifier for each ant and then be able to tell you which IDs you can loop through to send a message to all of them. The ant gen servers, like tiles, keep an ant.T in memory.

18:02

They can take external commands to move themselves or to deposit pheromones. To decide where to move, each ant asks the world's context for its surroundings, which is just a list of tiles. The ant then does a weighted random selection from that list using the ACO algorithm

18:23

and goes in that direction. Since the tile gen server logic is in the world's context, the ants are able to work with tiles without needing to know how they're persisted. So if we wanted to change our mind and decide to store tiles in a map or a database, because that's in the world's context,

18:41

the ants' context wouldn't have to change. So, those are the highlights. With a simple tree, we're able to spin up hundreds of ants and tiles, having them work together in a useful way. So, now that we've gone through that, let's actually see this thing in action. Oh good, it worked.

19:00

So, this is a gif, but of the running simulation. You can see that the ants at first sort of spread out into the world, quickly found the first one, and you can see the yellow trails as they lay down pheromones and focus up, getting further and further ones away.

19:22

For the, each turn takes about six milliseconds to run, and the way it works is the app is waiting every 50 milliseconds, is requesting another turn from the server, it's doing all these calculations and then sending back some state that it's rendering. Since we heard about Elm earlier, I'll mention that this is a Reason React app,

19:42

which is another super functional front-end language, but a topic for a whole other conference. Yeah, so, I think that's kind of neat. I don't know. All right, so, what did we learn? First of all, this is how Skynet starts.

20:03

And OTP supervisors already know how to terminate. All right, we've learned that dynamic supervisors, registries, and via tuples are useful for handling large numbers of dynamic processes. We've learned that answer pretty cool,

20:21

it's a good note. We've learned, keep your business logic in your contacts. We didn't even look at the Phoenix web code in this talk because there's nothing to see. The turn endpoint just immediately calls out to the simulations contacts, which orchestrates telling all the other contacts to do their work, and then kicks it back to the client.

20:44

And it's useful to declare the types of your data structures, including structs, and you can name the main type in a module T. And finally, you can do a lot with processes. The funny thing about this program is even though we're running all these things, it's essentially synchronous.

21:01

Every turn, we wait for all the ants to move, then we have them place their pheromones, then gather everything up for display, then respond to the client. So, why go through all the trouble with gen servers? One potential benefit would be concurrency. If you can, on a multi-core processor, you can have ants all working at once,

21:20

maybe you'll get some performance benefits. In this case, things actually got faster when I turned the concurrency down a little bit. But a more important gain is fault tolerance. Without processes, if one of these ants had a bug, no pun intended, they could corrupt the state of the entire simulation and bring it down.

21:42

Both every ant is in its own process if an ant manages to wander off the edge of the world as they did for a while, the supervisor is able to kill that ant and restart it back at home. If we needed to keep this thing up for a long time, that would be really useful. Finally, it's a nice way to think.

22:01

I'm simulating independently acting ants, so it's nice to be able to model my code that way. In some ways, this almost feels object-oriented. Essentially, we've got a bunch of instances of ant and class tiles, or, I'm sorry, instances, yeah, of ant and tile classes, each with its own state so that you can call to update them. So, we get some of the benefits of OOP, nice models,

22:24

but without the drawbacks of mutable state and crazy inheritance trees and all the other reasons that we are all at an Elixir conference. So, going all in on processes isn't going to be the right fit for every project, but I think it's a good fit for more projects than we realize. That's it.

22:41

If you want to crib any code or play with ants yourself, that's all in the repo here. Thanks.