
Learning from FP: Simulated Annealing in Haskell and Ruby


Formal Metadata

Title
Learning from FP: Simulated Annealing in Haskell and Ruby
Number of Parts
65
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt, copy, distribute and transmit the work or content, in adapted or unchanged form, for any legal and non-commercial purpose, as long as the work is attributed to the author in the manner specified by the author or licensor, and the work or content, including adapted forms, is shared only under the conditions of this license.

Content Metadata

Abstract
Haskell is a functional programming language that cleanly separates pure algorithms from messy real-world concerns. To learn how it ticks, I've translated an algorithm for Simulated Annealing from Haskell to Ruby. It also gave me an excuse to play with ruby-processing, a toolkit for graphics processing ;). Come learn about Functional Programming and how it can make your Ruby better!
Transcript: English (auto-generated)
Hey, I'm Josh. I'm going to talk to you about functional programming with Haskell.
But first I want to talk about this concept of procedural versus functional. So let's say I need to make a cake or a cupcake. It's very normal for me to think about a recipe. I've got a list of ingredients. I've got step one, step two, step three,
you know, mix the dry things together, throw the wet things in there. I don't know how to make a cake, so here are the instructions. And that's a pretty normal way, a very natural way for us to think about making a cake. But it's also perfectly natural and perfectly valid. Well, maybe not natural, but perfectly valid for us to think about a cake as being a formula. It's a construction of independent elements.
So a cake is some cake, or at least a cupcake is some cake plus frosting. And that cake is in itself a formula, which is dry ingredients plus wet ingredients plus baking. So if you think about it, it's a totally valid way also to think about what is a cake. It's maybe this list of instructions, or maybe it's this formula.
And there's actually value in getting outside of your comfort zone and thinking about making a cake in terms of a formula. Because if I know that a cake is a cake and frosting, then I know that, well, if I need to make a million cakes at once, then I can say, okay, you guys are the frosting crew. You're just going to make frosting all day long. And you guys are the cake crew. You're just going to make cake.
And then I'm going to combine those two together to make cupcakes. And it's going to work really well. And then if the cake crew gets overloaded, and they can break that down and say, okay, you're the dry ingredient team and you're the wet ingredient team, and we're going to bring this together. And then you're the baking team, and we're going to bake those. And when you look at it from the top down like this, you can see that there's these easily parceled out bits of work to do. And so I would argue that that's kind of how it is with functional programming.
When you look at programming from, and it's really a backwards way of thinking about it from the OO side, if you look at programming from this functional space, then you can conceive of your code more simply and think about it in more objective terms. It's more easy to see the computation that's going on, the algorithm that you're dealing with, as opposed to how you got there.
And so that's kind of the thesis for this talk. And it's kind of what got me into learning Haskell in the first place. So Haskell is a functional programming language. In order for me to actually show you how I solved the simulated annealing problem, which I'm going to get into with Haskell, I do need to teach you a little bit of Haskell. So bear with me for the next few slides. Haskell is a purely functional programming language.
It does not let you mutate anything: all its variables are immutable. You can actually technically do memory references and mutation, but really it's immutable. It's purely functional. It's got a really amazing type system. I got so excited yesterday when Matz talked about an optional type system for Ruby.
I was like, yes, this is exactly what I want, because you'll see, types are actually really, really beautiful when they're done the right way. And so that's very exciting to me. The Haskell Wiki, if you look it up, says it's a modern, standard, non-strict, purely functional, yada yada. It's a language. You'll see it's actually pretty simple.
So what's some Haskell look like? So here's a very basic bit of Haskell. You can see up top that f is a function that takes an integer and returns an integer. So if we read that together, the double colon says is a function. So f is a function that receives an integer, and then the arrow means I'm going to return an integer. And the implementation of f is that f for some x is two times x.
Very easy to read, not too far outside of our comfort zone just yet. So if I save lines one and two there in a file called f.hs, I can actually load that file into interactive Haskell. GHCi is like IRB. It's your REPL, essentially, for Haskell.
Haskell is, as a matter of fact, a compiled language. So what this REPL is doing is compiling on the fly. So if I load f.hs, and you can see that it's colon-l f.hs, that loads the file into the REPL, and then I get access to that function that I've defined. So if I run f 5, I get 10. The way you apply arguments to functions in Haskell is you just put a space.
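For comparison, the same tiny function and its application can be sketched as a Ruby lambda (my translation, not a slide from the talk):

```ruby
# Haskell:  f :: Int -> Int
#           f x = 2 * x
f = ->(x) { 2 * x }

f.call(5)  # => 10
f.(5)      # => 10, shorthand call syntax
```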
Haskell has a really clearly defined precedence system, an order of operations. So it's always pretty obvious what's going on. Okay, so the canonical example for Haskell is quicksort. If you're not familiar with the quicksort algorithm, quicksort is a way to sort an array. You basically take some pivot element of the array,
it's usually just the first thing in the array, and then you divide up the rest of the array into things that are less than that pivot and things that are greater than that pivot. And then for things that are less than, you go and quicksort those. And for things that are greater than, you go and quicksort those, and then you kind of add it all together. So it's a recursive function. The definition is recursive. And it's a very nice way to sort things.
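As a rough Ruby translation of that recursive idea (my own sketch, not the speaker's code):

```ruby
def quicksort(arr)
  return [] if arr.empty?                 # base case: an empty array is already sorted
  pivot, *rest = arr                      # take the first element as the pivot
  lesser  = rest.select { |x| x < pivot }
  greater = rest.select { |x| x >= pivot }
  quicksort(lesser) + [pivot] + quicksort(greater)
end

quicksort([3, 1, 4, 1, 5, 9, 2, 6])  # => [1, 1, 2, 3, 4, 5, 6, 9]
```

Like the Haskell version, it stays short because the partition-and-recurse structure maps directly onto expressions.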
So in Haskell, it's a very simple implementation, a very straightforward implementation because Haskell is really good at recursion. So let's read this together. So this says that quicksort is a function that, now here's a new thing, for some orderable a, I'm going to receive a list of a's and return a list of a's. So here you get a little bit of taste of the type system in Haskell.
There are these things called type classes; here you can see Ord, the orderable one. A type class is basically like an interface. It's a contract: it's saying that if you are a type that is inside of this type class, then you're going to implement certain things. And so for orderable, that means that I can return some ordering of two of you.
So if I have integers, it's very clear, less than or greater than. If I have strings, then we can use lexical comparison. If I have cats, I can define my ordering, whatever I like. But it can all be a part of this orderable type class. And then quicksort will work on anything that's orderable. So that's really nice. So I'm saying anything to the left of the fat arrow is defining contracts that the types to the right have to conform to.
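Ruby's closest analogue to a type class like Ord is a mixin such as Comparable: you implement `<=>` and the module supplies the comparison operators for free. A sketch, with a hypothetical Cat class echoing the example above:

```ruby
class Cat
  include Comparable   # the "contract": define <=> and Comparable supplies <, >, ==, between?, etc.
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # My chosen ordering: compare cats by name
  def <=>(other)
    name <=> other.name
  end
end

[Cat.new("Zoe"), Cat.new("Ada")].sort.map(&:name)  # => ["Ada", "Zoe"]
```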
And then anything in lowercase is an abstract, is essentially a type variable. So a is anything that conforms to the orderable type class. So let's keep reading. So quicksort is a function that for some orderable a takes a list of a's and returns a list of a's. And then Haskell does pattern matching. So I've got two definitions of the quicksort function here,
and it just pattern matches by what it's receiving top to bottom, basically. So the first definition of the quicksort function says that if I receive an empty array, then the definition is just an empty array. That's my base case for my recursion. And the other definition is if I receive basically anything else, I'm going to take the first thing and the rest. And this is a pretty common paradigm in Haskell.
I'm going to essentially pattern match on an array, put the first element in p and the rest in x's. And then the definition of that is I'm going to quicksort everything that's lesser, I'm going to concatenate that with p, and then I'm going to concatenate that with quicksorting everything that's greater. And so notice at this point I haven't defined what lesser and greater are. So actually what the where thing here just below does
is it introduces local bindings. So inside of the context of just this function, I'm now defining these concepts lesser and greater. And so lesser is anything that filter, if I run the filter function, everything that's less than p on x's, and greater is filter everything that's greater than or equal to p from x's. So again, I'm not expecting you to understand everything that's going on here,
but I wanted you to see that Haskell's a pretty easy language to read and to understand at least from an observer's point of view. Again, if I load this into GHCI, I can load quicksort.hs, and I can run that function, and it works for me. Haskell's also a lazy language. Haskell will only evaluate as much as it absolutely has to
to get you the right answer. So here I can happily define an infinite list. It's zero dot dot infinity. And then Haskell will just spit out all the values forever. If I hit control c, it'll stop. That's why you see it interrupted. There's a function take which says I'll take n elements from some list. So Haskell, because it's lazy,
I can just take four elements from the infinite list, and I'll just get the first four elements. I can take four elements from the list of all even numbers. Here I can define an iteration system. Haskell also has list comprehensions. Here I'm going to take the first ten things from this list comprehension, and I don't need to get into detail, but it's basically trying to show you
that you can define some arbitrary infinite list that might generate useful values for you and that you can use those later on. So back to Haskell's type system. Like I said, it's really powerful and really flexible. So here I'm creating a function bangit. Let's read this. bangit is a function that receives a string, a character, and an integer, and returns a string.
And I'll get back to all those arrows; I know it's a little weird. So my implementation of bangit is: I'm going to receive a message, a bang, and a count, and I'm going to return the message concatenated with the bang replicated count times. replicate is a built-in Haskell function. So you can see as I run it that I've loaded it again into GHCi; I've just left that out here.
There's a colon-t command in GHCi, which lets me look up the type of a certain thing that I'm asking for. So if I look up the type of bangit, I see that bangit is a function, like I mentioned above, that takes a string, a character, and an integer, and returns a string. And if I run bangit on "hello" with a bang and four, I get "hello!!!!". And that's exciting. So that's nice, but it may not be clear
if I'm trying to share this function with somebody outside of me or my little working group. So what I can actually do in Haskell is I can define type aliases. So here I've said that I've defined four new types in Haskell. Bang count, message, bang, and result. And those are just aliases for int, string, char, and string again. And then I can actually change the type signature of bangit
to say that bangit receives a message, a bang, and a bang count, and returns a result. And then it's much clearer to me down here in ghci when I look at the type of bangit, I say, oh, well, bangit receives a message, a bang, and a bang count. And as long as these are concepts that I'm using in my domain, then this should make sense to people. Like if I'm working with orders,
as long as I say that this function receives an order, I know, okay, I've got an order here, I can just add it to it, as opposed to some bag of data which might be just like a bunch of strings or something. Additionally, these types, these type aliases, because they're just aliases, they can work with anything from the rest of the standard library. So here I'm importing the data.char library and I'm using the toUpper function
over the result of bangit, and I'm mapping everything to uppercase. A quick thing here: the dollar sign is essentially like a left parenthesis whose matching right parenthesis sits at the very end of the line. It's just a way to keep things clear and clean. You'll see it a lot if you start reading Haskell, so I wanted to make sure you saw that.
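The bangit example translates to Ruby quite directly (my sketch; Ruby has no type aliases, so the domain vocabulary can only live in parameter names or documentation):

```ruby
# Haskell: bangit :: Message -> Bang -> BangCount -> Result
def bangit(message, bang, count)
  message + bang * count   # String#* replicates, like Haskell's replicate
end

bangit("hello", "!", 4)          # => "hello!!!!"
bangit("hello", "!", 4).upcase   # => "HELLO!!!!"  (like mapping toUpper over the result)
```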
Okay: partial application. So Haskell is, like I said, functional, and here's where I get to talk about those arrows. Those arrows basically say: I'm going to create a function that takes one argument, and that's going to return you a function that takes another argument, and that's going to return you a function that takes another argument.
So we kind of make that all clear by just putting a bunch of arrows in line. It's really sugar. It's really easy to read. Again, you don't have to really understand the details, but I just wanted to show you how this works. So if I give 3 times 5, I get 15. It turns out that the multiplication operator is really just another function and it's an infix function. It comes in between its arguments. So if I wrap an infix function
in parenthesis in Haskell, I get a prefix function. And so I can actually say, what is the type of the multiplication function? And it's for some number, a, I take two a's and I give you an a. And so I can also run it that way in a prefix notation. I can say, multiply 3 and 5 and I get 15. This will be familiar to lispers out there.
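Ruby can mimic this partial application with Proc#curry; a small sketch:

```ruby
mul  = ->(a, b) { a * b }.curry   # curried: supply arguments one at a time
mul3 = mul[3]                     # partially applied: now a one-argument function

mul[3][5]     # => 15
mul3.call(5)  # => 15
```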
I can also say, what is the type of multiply just 3? And here I see that it says the type of the multiply times 3 function that you've just created here is for some number a, you give me an a and I give you an a. And so I can actually wrap that into a function here and say, mul3 is a function that for some number a receives an a and
returns an a. And the implementation of mul3 is that for mul3 with n, I'm going to return you multiply 3 by n. And then I can run that and you see it gives you mul3 with 5, it gives you 15, just like you might expect. I can also leave off that n because I'm really just returning a function. I don't have to give it the rest of the arguments. It has to figure out
that this needs another argument. And so the type signature is the same, but I can leave off the n on both sides. It's more like math in this way. And it still works. One more thing I want to talk about with Haskell is higher order functions. So a higher order function is a function
that receives a function as one of its arguments. And so fold is a function that you're going to see a lot. There's a number of different kinds of folds in Haskell. There's a left, right, and strict. I'm not going to get into the details. Really, a fold is exactly the same thing as an inject in Ruby. And there's subtle differences in implementation, but it's basically the same thing. So here I'm saying fold L, the
plus function, starting from 0 over 1, 2, 3, and 4. And that's the same thing as giving an array 1, 2, 3, 4 inject, starting at 0, and just adding the arguments together. So you're going to see that a couple times if you read Haskell. That's the kind of high level idea. If you look at the type signature, it's actually really interesting. So the fold L function
is, this is saying fold L is a function that receives a function and a b, and a list of a's and returns a b. And that function that I'm receiving, that function has to have the type signature that receives a b and an a and returns a b. Now there's a b and an a because fold
is generic. You can actually give it different types of things, and it'll theoretically work together as long as you've got the right function in the middle. In this case, the plus function just takes two things of the same type. So really, the signature for the way I'm using it here is that I'm going to give it a function that takes two numbers and returns a number, and I'm going to give it a start number, and I'm going to give it
a list of numbers, and then I'm going to get a number out. So when you have this kind of strong type system, you can actually reason very clearly about your code, and it's actually a lot easier to figure out what to plug in where to get things to work together. There's quite a lot of time that I've been writing Haskell, and I'm like, what's the type of that thing? Oh, okay, how do I just need to line up the types? What do I need to do just to get the types to work? And then as soon as those
types line up, it just works. That's really pretty great. One more thing with Haskell: I just want to talk about IO. IO is pretty straightforward. You can see here we're going to define a function main. main is a function that returns an IO of unit, the empty, nullary value. And then for main, I'm going to do a put-string, I'm going to get the name
out of the standard in, and I'm going to put that back out. Now, that's kind of weird syntax, isn't it? What is this IO thing? Well, I am not going to cover monads. But you don't really have to understand monads to read the rest of this. Just, they're there. So I'm just going to kind of
gloss that over. I highly recommend Tom Stewart's talk about monads. It's on the internets. It's really great if you want to understand them better from a Ruby perspective. They're really handy, but not today. So basically, the way that this works is basically how you'd expect. If you run the main function, it asks you, what's your name? I type in Josh, and it says, hello Josh. That's all you
really need to know for now. Okay, that's all the Haskell that I need to teach you at the moment. What I really want to talk about is annealing. Actually, I want to talk about simulated annealing, but in order to talk about simulated annealing, I need to talk about what annealing is. So annealing is actually a physical and chemical process that matter goes through and it
does stuff. So when I think about it, slight aside, I write software for a living. Actually, most of the time these days I write email for a living. It's very abstract, very virtual. Sometimes it's really nice to get your hands dirty and do something. When I have free time, which I haven't had a lot of, I like to make jewelry. I like to metalsmith
or bang on metal with hammers. If you do that for a while, if you're banging on a piece of metal, you can make it in all kinds of interesting shapes. But if you keep banging on it long enough, eventually that metal is going to start to crack. And what's going on here is that the molecules of that metal they start off kind of in this happy state and then as you bang on them, they kind of get twisted up together into this unhappy
state. It's called a high energy state. And so at some point the energies are going to get so high that those molecular bonds between them are going to start to come apart and that's when the cracking happens. So annealing is a process of putting some physical structure under temperature for a while and then essentially that additional temperature
gives those molecules in the structure the freedom to start kind of moving around a little bit. And if you don't make the temperature too high, then the shape of the overall structure doesn't change. But the atoms get a chance to settle back down into a happier state. And then you can continue to bang on that metal as long as you like again. So you can see here on the far right
that it's kind of a scanning electron microscope view: on top, a metal that has not been annealed, and you can kind of see all those grains in between things. And then, after we go through an annealing process, on the bottom you'll see that the grains have settled out and it's a much more easily workable system. So simulated annealing is really that same idea, but it's done in software. So
it's applicable to all kinds of different problems. So let's just read through this algorithm, and here's some Ruby, finally. We're getting to the Ruby. So we take some initial state. We define an energy function that tells us what is the overall energy of that system. We'll get into that in a sec. And then for some amount of time we apply this annealing function. And so in this annealing function,
I'm going to get a current temperature of the system. I'm going to try to change this state in some way. I'm going to look at the next energy of that next state. And then I'm going to run this probability function which takes the difference of the states and the current temperature and returns me some value between zero and infinity. And if that value is greater than some random number
between zero and one, then I'm going to switch to the next state. There's a number of things here that I've kind of glossed over. So first off, the energy function. How do you define an energy function for a system? Well, it depends on what kind of system you're trying to anneal. So annealing is really useful for a variety of problems. One of the first ones that comes to mind is wedding planning. I've got a bunch of tables and I've got a bunch of people that are trying to sit at these
tables. I got married two years ago. I never want to try to do that again. Although this time I might use simulated annealing because it's kind of interesting. So basically I define an energy function between any two people in my system. That energy function is how dissimilar are these people or how much are they going to hate each other or something. I can issue them a personality survey and do some math on that
or whatever. So if they're high energy they're going to be unhappy sitting next to each other. If they're low energy they're happy sitting next to each other. So I randomly distribute those people throughout the tables. I sum the energy of the entire system. I look at how unhappy everybody is basically. And then that gives me my current energy. I'll come back to the temperature thing. A mutation of that state would be that I swap
two random people in the system. So I might check the energy, swap two random people, check the energy at that point. So the question is has the energy gone down or gone up? The thing with simulated annealing is sometimes you want the energy to go up. So that's why the probability function is important. So the probability function says, okay, what is the difference in the energies and what is the overall
current temperature of the system? And the result of the probability function needs to be that for anything that lowers the overall energy of the system the result should be greater than one. It should always mutate in that direction. But sometimes you want to mutate into these bad directions. As we see with refactoring, sometimes you need to go through a bad place to get to a good place. That's kind of the idea with simulated annealing.
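The loop I just described can be sketched in a few lines of Ruby. This is an illustration rather than the code from the talk; the `energy`, `mutate`, `temperature`, and `probability` lambdas are stand-ins for whatever your particular system defines:

```ruby
# A minimal sketch of the simulated annealing loop described above.
# The four lambdas are placeholders for a real system's functions.
def anneal(state, iterations:, energy:, mutate:, temperature:, probability:)
  current = state
  iterations.times do |i|
    temp      = temperature.call(i, iterations)
    candidate = mutate.call(current)
    # Accept the candidate when the probability beats a uniform random
    # draw; energy decreases always win because their probability > 1.
    if probability.call(energy.call(current), energy.call(candidate), temp) > rand
      current = candidate
    end
  end
  current
end
```

With a probability function that always returns something above one, every mutation is accepted, which makes the loop easy to check in isolation.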
For any mutation that increases the energy, the bigger the increase, the closer that result should be to zero. And then sometimes, for those increases, that random draw is going to let us actually go in that direction and see where it takes us. We're going to run down that road and see if we get to a good place or not. And then here's
where the temperature function comes into play, is at a higher temperature, I want to get those values closer to one, and at a lower temperature I want to get those values closer to zero. And over time, early in the system, if my temperature is high, I'm going to be more likely to run down these scary bad roads that might take me to an overall higher energy state. But as the temperature of the system falls,
I want to take less time exploring those bad roads and spend more time on the good ones and just try to get it into a better local state. The way Wikipedia explains this is that I'm trying to avoid a local maximum and find the global maximum. If you think of a chart of mountains and you're trying to climb the highest mountain, if you just have a very naive algorithm
that's just going to climb until it sees a downward slope, it's going to find some random mountain that might not be the highest mountain. So sometimes you need to go down off that mountain and look for a bigger mountain. That's kind of the idea of simulated annealing. It's a search for what is the overall lowest energy state of that possible system. You're not guaranteed to get it out of simulated annealing, but you're
probably going to find a decent one. And then, as for how that temperature function comes into play: over here on the right you can see there are two images. If I just run my temperature down really fast and don't explore some of these really bad states, I'm going to get a system that looks more like the top. If I run my temperature down much more slowly and look for an overall better place, I'm going to get a system that looks more like the bottom.
It's basically a question of how many of these local maxima I want to play with and whether I'm okay settling for them, versus finding an overall global optimum. I'm not going to get into a lot of detail on how to choose a good temperature algorithm; I'm just going to show you one that works for me. So this is the same definition of the algorithm in Haskell. I've left out the
implementations of the temperature, probability, mutation, and energy functions. I just want to show you those type signatures. I really don't have enough time to go through all this; it's there for reference when you look at the slides later. So what am I annealing? Actually, this talk is really inspired by
this guy from the DC area, Conrad Barski. I haven't been able to get a hold of him, but he wrote this great tutorial about Haskell, and in it he planned a picnic. He found a park in DC and he gave 200 people a personality survey. So this park is what you see here. This is a diagram of the park, and the data
in the back is the result of that personality survey. The survey was, I think, 20 questions or so, with answers between one and five, about a variety of different things. And then he defined the energy function between any two people to be how dissimilar they are: a high energy means more dissimilar, a low energy means more similar.
And it's really just lining up those numbers: if the two people gave the same answer, they're more similar; if they gave different answers, they're less similar. And the idea is if you put all these 200 people in a park and you put them in the right places, they're going to have a great time and everybody's going to be happy. A little bit utopian, I think, but it was kind of fun. To get there though,
we have to take this park and cut it up into picnic lots. The code that I've got on GitHub does that. There's a lot of geometry involved: we cut the park up into triangles, group those into lots, find the center point of each lot, and then use those center points to figure out who can
talk to whom from each lot. I'm not going to go into any of that, but it's kind of fun if you want to play with some geometry or some interesting math. At this point, though, we can randomly place those people into those lots, and our energy can be, for every connection between lots, the energy between the two people on either end. And we can sum that all up. So when we just
randomly distribute those people into those lots, we get a graph that looks kind of like this. The red lines are high energy connections and the blue lines are low energy connections. So it's kind of random, which is basically what you'd expect from a random distribution. And the idea is that after we anneal this, we can come out with something that looks more like this, where we end up with a lot of blue, just a very little bit
of red, and a much happier system and a happy picnic. So how does that work? First off, I want to talk about the data structures that I've used. In Ruby we've got six classes. They're all just basically structures. Simple things like a point, which is an X and Y coordinate. A person, which is a list of answers. A polygon
is a list of points. A polygroup was a useful class that I needed to deal with an array of polygons to cut them up and allocate things. A placement, which is one of the most important ones. A placement is a combination of a point and a person. So I know that for this XY position I've got this person, which is a list of answers. And then the overall park, which is a list of placements. People and positions.
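In plain Ruby, the six structures just described might be sketched like this (a sketch; the actual classes in the repo may carry more behavior):

```ruby
Point     = Struct.new(:x, :y)          # an X and Y coordinate
Person    = Struct.new(:answers)        # a list of survey answers
Polygon   = Struct.new(:points)         # a list of Points
Polygroup = Struct.new(:polygons)       # an array of Polygons to cut up
Placement = Struct.new(:point, :person) # a Person at an (x, y) position
Park      = Struct.new(:placements)     # the whole system: people in positions
```

Plain Structs keep the data layer dumb, which is what lets the annealing code stay generic over it.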
On the Haskell side, I'm using the type system to say basically the same stuff. I've got a point, which is two floats. I'm not going to get into what that is. It's a tuple. It's just two floats. Think of it that way. A person is a list of integers. A polygon is a list of points. I've got some of those type aliases for time allowed, current time, things like that. And down at the bottom
I've actually defined some types for the kinds of functions that I'm going to use to implement the simulated annealing algorithm. And so the real beauty of the Haskell implementation is that the simulated annealing algorithm is totally unbounded by specific type. It's very abstract. It's generic. And so the way we do that is we've actually annotated some of these functions with abstract types.
The energy function type there on line 15 says that I'm going to have an energy function that works on some type a. It's going to take whatever that a is and return an integer, which is the overall energy. And then as long as I conform to that type signature, I can hand this energy function to anything that expects an energy function and it'll just work. And so this is really nice.
Of course, no program is really done without tests. With Ruby I used RSpec. Surprise. With Haskell, I just compiled it. There is a really good test framework for Haskell called HUnit. It's very effective, but the truth is that Haskell's type system is so good
that 80% of the time if it compiles it's going to work. Sam's laughing because it's true. I know. So that's actually really nice. Actually when I wrote this in Haskell I did a lot of refactoring and every time I kind of went down a wrong path I was like, well if it just compiles
I know it's going to work. So that worked out pretty well. Yeah, static types are great. Like I said, there's a lot of math in here around geometry. I'm really not going to spend a lot of time on that; just know that if you do look at the code, there's a lot of it. Here's some of the interesting stuff. So the energy function, like I was talking about, measures the energy of the system. On top you see
the Ruby. So the energy function of a park is that I'm just going to grab all of the energies of the various links in that system and I'm going to sum them. Inject with zero and sum. I'm going to use this mismatches function that's defined on person here, and I'm going to basically zip together all the answers, select the ones that are different, and look at the length. And that's my energy for any given link.
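Reconstructed from that description, the Ruby might look roughly like this — a sketch, with `links` simplified to plain pairs of people rather than the repo's real link structure:

```ruby
Person = Struct.new(:answers) do
  # Energy between two people: zip the answers together, keep the
  # pairs that differ, and count them.
  def mismatches(other)
    answers.zip(other.answers).select { |a, b| a != b }.length
  end
end

# Park energy: inject with zero and sum the energy of every link.
def park_energy(links)
  links.inject(0) { |sum, (a, b)| sum + a.mismatches(b) }
end
```

Two identical surveys give an energy of zero; every differing answer adds one.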
The Haskell implementation is basically the same. My picnic energy function is going to take a list of links and a current placement, which is the park, and return an integer. When I define the picnic energy function, I see that I receive links and a park, and the result is "sum the link energies". Well, that's actually nice. I don't know what "sum the link energies" means beyond
that it's summing some link energies. I can kind of get some information out of that. And then if I want to dive in and understand that better I can look at the local bindings. I can look at the things in the where clause. So sum the link energies where link energies is map some link energy function over all the links. And what is that link energy function? Oh it's right there. Link energy function says for an A and a B I'm going to
look at the mismatches between this person and that person. Find person is just a helper that I elided here; it basically just gives you back those lists of answers. And then mismatches is a function that, for two people, gives me an integer, and it's really doing the exact same thing as the Ruby. I'm not going to spend much time on the uncurry thing going on there; it's just turning the
type signatures around so they line up and work together. The temperature function, like I mentioned, needs this slope because early on you want to try all kinds of crazy ideas, and then you gradually want to shift into let's-just-work-on-the-hard-problems mode. Sounds like a startup.
So an easy way to put that together is using the exponent function. This is just the normal mathematical exponential. What we're doing here with the 50 and the 5 is just flipping the exponential from increasing to decreasing so it slopes the right way. The current is the current time of the system. What iteration
are we on? And the max is the maximum number of iterations. So we're just scaling that. Okay, I'm not a mathematician. I mostly like to write math by drawing lines and telling other people to figure it out. I like to play with graphing calculators though, so that's kind of how I figured out what to do. On the Haskell side, picnic temperature is a function that is a temperature function.
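Using the 50 and the 5 mentioned above, the schedule can be sketched like this in Ruby (the constants come from the talk; the exact form in the repo may differ):

```ruby
# Exponentially decaying temperature: starts at 50 when current == 0
# and falls toward zero as current approaches max. The 5 controls how
# steeply the schedule cools.
def temperature(current, max)
  50.0 * Math.exp(-5.0 * current.to_f / max)
end
```

Early iterations get a high temperature (lots of crazy ideas accepted); late iterations get a temperature near zero (almost only improvements accepted).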
This is where the type system is actually really nice. If we want to expand that, we can see in the comments that picnic temperature is a function that receives a time allowed and a current time and returns a temperature. And if you expand that again, you can see it's a function that takes an integer and an integer and returns a float. But we don't have to know all that. We just have to know that it's a temperature function. So as long as I
have a temperature function, I can give this thing that wants a temperature function a temperature function, and it should work. That's the nice thing about good types. And the implementation is really the exact same as the Ruby implementation. I'm going to use the exponent function in the same way. The fromIntegral thing there is a generic numeric type coercion function. Integers are different from floats. In Ruby you use to_f to convert from an integer to a float. In Haskell, if you just call fromIntegral on some number, it will convert it to whatever it needs to be. The probability function, like I mentioned, needs to be greater than one for all lower energy states and something between zero and one for all higher
energy states. And again, the exponent function is exactly what you're looking for here. The normal exponential of x is somewhere between zero and one when x is less than zero, and somewhere greater than one on the y-axis when x is greater than zero.
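That shape falls straight out of the exponential. A Ruby sketch of the probability function (naming is illustrative):

```ruby
# exp of (energy drop / temperature): greater than 1 whenever the
# candidate lowers the energy (always accepted), and between 0 and 1
# when it raises the energy -- closer to 0 for bigger increases or
# colder temperatures.
def probability(current_energy, next_energy, temperature)
  Math.exp((current_energy - next_energy).fdiv(temperature))
end
```

Dividing by the temperature is what makes the same bad move more acceptable while the system is hot.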
So I'm just taking the difference of energies and dividing it by the current temperature, and that scales the function to give you values that are more towards one early on, when the temperature is high, and more towards zero later on, when the temperature is low. The Haskell implementation is the same. Again, here we're using the type system:
the probability function is a probability function, which is a function that takes two energies and a temperature and gives you a probability, etc. I'm showing on line 7 above and line 7 below how you would use this probability function. The only thing that's really interesting here is that to generate random numbers in Haskell you have to have
a random number generator, and it's going to return another random number generator, and... yeah, I'm not going to talk about monads. It's not hard, it's just a little bit weird the first time you see it. Okay, the mutation function, just really briefly: in Ruby the interesting thing is that I've got some mutable state
and I'm going to mutate it when I call the mutate function so I'm going to take two random people in my system and swap them. If I don't like where I'm going I'm going to roll back that mutation to the previous state. So I'm using this local instance variable inspected to keep that around and then I can roll back and my simulated annealing system in Ruby
knows to call rollback when it rejects a mutation. In Haskell, though, I'm not mutating; I'm simply returning a new object, a new state, so I don't have to roll anything back. My mutate function can take a park and give me a new park that's randomly swapped a couple of its links, and I don't have to do anything; I can just discard it if I don't like it.
Again, you can see there are some random things going on there. I really don't want to spend too much time on the details, but I'm happy to get into it with any of you afterwards if you like. One interesting thing to note is the walking neighbors thing up top: the links that I defined earlier on the system are kind of who I might talk to, but you see
they leave that center island out, so I've just created another system that connects things a little bit more. It's another function that's just a little more connective. So if you're reading the code later and wondering what that is: it's just so I could guarantee that the center island is connected to the rest, so that people can move around, because I'm really only mutating local neighbors. I'm not swapping two random people from across the grid; I'm just going to swap somebody local.
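The mutable Ruby approach described above can be sketched like this (class and variable names are illustrative, and the real code swaps local neighbors rather than arbitrary pairs):

```ruby
class Park
  attr_reader :placements

  def initialize(placements)
    @placements = placements
  end

  # Swap two random placements in place, remembering the indices so
  # the swap can be undone if the annealer rejects it.
  def mutate!
    i = rand(@placements.length)
    j = rand(@placements.length)
    @last_swap = [i, j]
    @placements[i], @placements[j] = @placements[j], @placements[i]
  end

  # Roll back the last mutation by swapping the same pair again.
  def rollback!
    i, j = @last_swap
    @placements[i], @placements[j] = @placements[j], @placements[i]
  end
end
```

A mutate followed by a rollback always restores the original arrangement, which is the contract the Ruby annealer relies on.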
I don't have enough time to talk about all the other things that are really interesting in Haskell. There are no side effects, everything's pure, and that's really great. Mutation, like I mentioned, is not really a big deal. Haskell's really great at parallelizing: because of that functional nature, I can just distribute the work. It's also
really great with polymorphic types so I'm not going to spend a lot of time on all that. I do want to talk about Ruby processing though because I made a really fun visualization of what goes on with the annealing system and I did it in Ruby processing and hey this is a Ruby conference so let's talk about some Ruby. So Ruby processing is a gem that runs in JRuby only because it's interfacing with the processing
library, which is written in Java. The Processing library is basically a simple drawing library. It's fast, it's interactive, and it's really easy to get things done. A quick example: on the right there, that's all the code you need to get something going in Ruby Processing. I define a setup function, which sets the size of my sketch. I fill it with white
or rather, I set my fill to white so anything I draw is white. Then draw gets called for every frame, so I set my background to zero, which is black, and draw a 100-by-100 ellipse at my current mouse's x and y coordinates. You can see when I took that screenshot my mouse was in the middle of that ellipse; of course you can't screenshot the mouse cursor in OS X, but whatever.
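The slide code just described would look something like this — a sketch against the ruby-processing gem's `Processing::App` style, which needs JRuby and Processing installed, so treat the details as illustrative:

```ruby
require 'ruby-processing'

# An ellipse that follows the mouse, as described above.
class MouseEllipse < Processing::App
  def setup
    size 400, 400   # window size
    fill 255        # draw everything in white
  end

  def draw                              # called once per frame
    background 0                        # clear to black
    ellipse mouse_x, mouse_y, 100, 100  # 100x100 ellipse at the mouse
  end
end
```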
Anyway, you can see it's pretty easy to get going. So I made this thing. You see the park at the top with the current energy for all the links. I've got all the links there, and the links that I'm swapping for this iteration of the annealing system are circled. I've got a progress bar which shows me how far along
I am on the annealing run. Again, this needs to be a time-bound algorithm. You're not going to find the absolute best state; you're simply going to find the best one given the amount of time. Down below I've got some interesting graphs. The one on the left shows all the link energies sorted into a histogram. It's kind of nice to see it dance around as we go. And on the right I've got the temperature, the annealing schedule
of what my current temperature is and where it's going on the graph. If you want to see how this is drawn, again, Ruby Processing makes it really easy to draw these kinds of things. My park is already a list of triangles, and triangles are three points. So what I really do here is iterate through each of my polygons, use the triangle function
from Processing and give it six points and it draws a triangle. And it all looks like it's one big thing because all my triangles are contiguous pretty much. And that works out pretty well. My progress bar is really just two rectangles. I've got one big white rectangle for the overall area and then I've got a little grey rectangle for the current progress. Again, it's a pretty easy to use API.
The hardest part about using Processing is doing the math. Okay, so now I get to show it off. So I kick it off with RP5 run annealing render. Annealing render is the name of my sketch. RP5 is just the command line utility to kick off Ruby Processing.
And it takes a little minute to boot because it's JRuby. Hopefully JRuby 9000 cuts that down. And if I zoom in. Where's my mouse? There we go. Alright, so you can see it dance around. I don't know if you noticed at the very beginning though it was a little bit slow and then it got faster and faster.
That's JRuby doing the JIT; I hope you got to check out Charlie's talk about that. But you can see my energy histogram is dancing around a little bit as I try all kinds of random things early on. As the system progresses, it's going to dance around a lot less and only really change the high energy stuff. You can see the little red dot coming along as my temperature falls.
And eventually as this goes you'll notice that the system gets a lot bluer. So, that's really all I had. Alright, I've got a million links
if you want to learn more. Feel free to check it out. I'm not going to leave this up there, but I'll tweet out the deck. And, thanks! I'm Josh. I'm Jschmider on Twitter. CTO of Someplace and organizer of all the meetups in DC.
Just quick things: the code is on GitHub. I highly recommend you check out the Exercism Haskell track. If you don't know what Exercism is, oh my god, you've been missing out on some great stuff. Check it out; it's really great if you want to learn Haskell. And a huge thanks to Conrad Barski for essentially teaching me Haskell
through his great Haskell tutorial on Lisperati.com.