RustConf 2020 - Closing Keynote
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 10 | |
Author | ||
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/52198 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Transcript: English (auto-generated)
00:16
Oh hi there. I didn't see you. I want to tell you a story.
00:22
20 years ago, in the year 2000, a game came out for the Nintendo 64 called Hey You, Pikachu! For Christmas that year I was hoping to get the special Pikachu edition N64, but my mom told me that in a few years I wouldn't even still be into Pokémon. Here I am 20 years later, still playing Pokémon, and today I want to share that love with you.
00:43
And mom, this talk is for you. Hey everybody. My name is Sean Griffin. Text encoding is hard, so sometimes it's spelled like that.
01:01
My pronouns are they/them. Let's talk about Pokémon. The first Pokémon game was made by a small team for Japanese audiences. The game was made on a tiny budget, and the programming team was only four people. In 1996, Pokémon Red and Green were released, and sales vastly exceeded expectations. Later that year, an updated version was released in Japan with improved graphics and more polish.
01:24
After it was clear that this game was far more popular than anybody expected, there was a mad rush to localize it for international audiences. Two years later, in 1998, Pokémon Red and Blue were released to the rest of the world, and Pokémon would go on to be the highest-grossing media franchise of all time, eclipsing even Mickey Mouse and Hello Kitty.
01:42
In fact, it became so popular that even if you've never played Pokémon, I'll wager you've seen this one before. This is an actual picture of Ryan Reynolds from 1998. Okay, not really. This is Pikachu, by far the most famous Pokémon, but there was a close second. This is Missingno. Missingno is a glitch Pokémon, and you could only encounter it through a glitch.
02:03
But the thing is, everybody knew about this. One of the things I find so fascinating about Missingno is just how widespread it was. In a survey I ran, 87% of people who owned the game knew about the glitch when it was relevant, and 80% of those people heard about it through word of mouth, not the internet.
02:24
And there was a good reason. Missingno could duplicate items. Now this glitch had a lot of names: the Missingno glitch, or the item dupe glitch. At my school, it was called the Rare Candy glitch, since most people used it to duplicate an item with that name.
02:40
It made your Pokémon more powerful whenever you used it, so it was a really desirable item to duplicate. Let's take a look at how you performed the glitch first. We're going to start off in Viridian City, one of the earliest areas in the game. And we talk to this old man, he's going to ask if we're in a hurry, and we're going to tell him no. In response to that, he's going to say, oh, cool, why don't I show you a tutorial about how to catch Pokémon?
03:03
And he's going to go into this battle and find a Weedle, and he's going to attempt to catch the Weedle. We actually are in a little bit of a hurry, so we're not going to sit through and watch this. Next, we need to fast travel to an area called Cinnabar Island. Now, this being the Generation 1 Pokémon games, the way fast travel worked is we transform into a bird and just fly away.
03:25
Once we get there, we want to go onto this water, so we're going to do that by opening up the menu, and this being the Generation 1 Pokémon games, we're going to transform into a giant seal thing? I don't really know. We go up and down this coast, and eventually we'll get a wild encounter.
03:41
If you played these games, you might notice this pause right here is way longer than it's supposed to be, and we're going to go into why that is a little later on. So here we see our friend Missingno. It's level 168, which is higher than you're supposed to be able to encounter in the game. The maximum level is 100. We're going to immediately run away from Missingno, open up our inventory, and when we go down to the sixth slot in our inventory, which is where I had the rare candies,
04:02
we'll see that I now have "flower 2". I had four Rare Candies in my inventory before this started, and 4 plus 128 is 132. This clearly was not meant to render numbers larger than 99, so I guess the way it renders the "13" is a flower.
04:23
If you've never seen this glitch before, this probably seems like an extremely random sequence of events for such a specific outcome. And it is! But let's break down each piece of this. As with most major glitches, there's no single bug that's responsible here. This happens because of a bunch of different bugs,
04:41
and in most cases you can't even really call them bugs, just properties of the code being used in unexpected ways. Now I want to state up front, I did not work on this game, nor have I interviewed the programmers who did. I have spent a lot of time looking at disassemblies of the game, and I think we can infer a lot about what was intended from reading the code and knowing about the constraints that they worked under,
05:03
but I want to make it clear that a lot of this is speculation. With that out of the way, let's start going through each of the pieces of this glitch. I'm going to go through them in the order that I think they were initially discovered. So the first thing you might be wondering is, what's up with those coast tiles? Why do we go to that spot specifically and go up and down the coast?
05:22
In the Pokémon games, there's sort of a grid system that the player occupies. This is one tile that the player is standing on, and they can move one tile up, down, left, right, etc. This is that same tile when we remove the player from it. Now, even though from a gameplay point of view, this grid system occupies single tiles,
05:42
the code actually sees the game a little more fine-grained than that. It sees this tile as four subtiles. And for the tile the player is standing on, these are the coordinates. The upper left subtile of where the player is standing is coordinate 8, 8, and the bottom right is coordinate 9, 9. Now, whenever you're moving along these tiles in the game,
06:02
it's going to continuously be checking to see if you can encounter a wild Pokémon. There's two main ways that you could encounter Pokémon in the first generation of games. You could be surfing on water, or you could be walking through tall grass. There was also fishing, but it worked completely differently and is unrelated to this glitch, so we're just going to pretend it doesn't exist.
06:21
This is what the code for that looked like. First thing we're going to do is load up the tile at 9, 9, so the bottom right subtile where the player was standing. Then we're going to check to see what kind of tile it is. We need to figure out how likely we are to run into a wild Pokémon. If the tile is grass, we're going to use the grass encounter rate for the current area.
06:40
If the tile is water, we're going to use the water encounter rate for the current area. And if it's neither, then the player cannot have an encounter with a wild Pokémon here, so we just exit out. Then we actually need to compare this to a random number generator, which isn't actually random, but we're not going to get into that today, to determine if we actually get an encounter. I've left that code commented out here because it's not actually relevant to why this bug occurs.
07:06
Next they're going to load up the tile the player is standing on again, but this time they're loading the bottom left subtile. And then we're going to determine which kind of Pokémon they can encounter. So if the tile is water, we're going to select one from the list of water Pokémon for the area,
07:20
otherwise we're going to select one from the list of grass Pokémon for the area. And so because they load up the tile the player is standing on twice, and the second time they use a different subtile, whenever we're on a tile that looks like this, where the right side is water and the left side is land, every time it's doing one of these checks,
07:41
it thinks we can do an encounter because we're on water, but because the bottom left subtile is not water, it's going to load up the grass Pokémon. Now when we look at this bug in Rust, I think this is the easiest bug that we're going to look at today to scoff at and say, "this should have been caught in code review," or, "I wouldn't have written this."
08:02
And it really does stand out in Rust. You're assigning the same variable twice, it's right there on the same screen, and it's different numbers. This just sticks out like a sore thumb to me. But they didn't write this program in Rust, they wrote it in assembly. And Rust loses some of the nuance of what happened here. You wouldn't have written this code in Rust in the first place.
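The slide code is not part of this transcript, so here is a minimal Rust sketch of the check as described above. The type and function names are hypothetical, the sub-tile coordinates are assumed to be `(x, y)` pairs, and the random-number roll is left out, just as in the talk.

```rust
/// Terrain kinds a sub-tile can have (hypothetical names for illustration).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Terrain {
    Grass,
    Water,
    Land,
}

/// Which list of wild Pokémon the game ends up choosing from.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum EncounterList {
    Grass,
    Water,
}

// Sub-tiles of the tile the player stands on, assumed to be (x, y).
const BOTTOM_RIGHT: (u8, u8) = (9, 9);
const BOTTOM_LEFT: (u8, u8) = (8, 9);

/// Returns the list an encounter would be drawn from, or `None` if the
/// player cannot have an encounter on this tile at all.
pub fn pick_encounter_list(sub_tile_at: impl Fn((u8, u8)) -> Terrain) -> Option<EncounterList> {
    // First load: the bottom-right sub-tile decides whether an encounter
    // is possible (and which encounter rate would be used).
    let tile = sub_tile_at(BOTTOM_RIGHT);
    if tile != Terrain::Grass && tile != Terrain::Water {
        return None;
    }

    // (The random roll against the encounter rate would go here.)

    // Second load: a different sub-tile, the bottom-left one, decides
    // which list of Pokémon is used. This mismatch is the bug.
    let tile = sub_tile_at(BOTTOM_LEFT);
    if tile == Terrain::Water {
        Some(EncounterList::Water)
    } else {
        Some(EncounterList::Grass)
    }
}
```

On a coast tile whose right half is water and whose left half is land, the first check passes because of the water, and the second load then falls through to the grass list.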
08:22
There's absolutely no reason that you would have assigned tile a second time. But in assembly, you don't just have however many local variables you want. When you write a program in Rust, the compiler is going to determine where to store every variable that you write. It's either going to assign it to a register,
08:42
sort of like a global variable that your CPU uses, or it's going to put it on the stack. Now in the Pokémon games, they did have a stack, but it was really tiny, only 207 bytes. So they basically never used it unless it was absolutely necessary. The main place it was used was for audio playback.
09:04
Now in this bit of code that I commented out, this is where a lot of the context gets lost. First of all, these two lines just don't appear on the same screen. In the assembly version of this, they're actually about 50 lines apart, so you wouldn't see them both at the same time. That alone, to me, makes it much more reasonable that this would have just slipped through code review.
09:23
If I can't see both of these at the same time, I'm much more likely to just not spot that. Now, the Game Boy used a variant of what's called Z80 assembly. And we're not going to go too much into the minutiae of what that means. What's important about the differences between various assemblies
09:40
is different types of assemblies have a different number of general purpose registers. The sort of global variables that your CPU can use for literally anything. And on the Z80, they only had four registers that were truly general purpose. And in this code that I commented out, they used all of them.
10:01
So, they had to load up the tile again. They could have stuck it on the stack maybe, but you don't want this to be the one place where, oops, we don't have enough stack anymore. So they just reloaded it. And that seems perfectly reasonable to me. Frankly, if you had to work on this constraint, could you write your whole program with only four global mutable variables and nothing else
10:20
and avoid bugs like this? I sure as hell couldn't. Okay, so the end result of all of this is on these tiles specifically, the game tries to have us encounter a wild Pokémon because it's water, but we instead encounter grass Pokémon. But that's not by itself particularly useful. So let's move on to the second piece of this.
10:41
Whenever you enter a new area, the game has to load up the encounter tables for the area that you're currently in. There's one spot in memory that's just, this is the current area's grass encounter information. So the first thing it's going to do is grab how likely you are to encounter grass Pokémon in the current area. It's going to grab that from just some global section of ROM.
11:01
I've made that a constant here in the Rust translation. And then it's going to check if that number is greater than zero. And if it is, it's going to copy the encounter table over. And then it does the same thing for water. What's important here, though, is what happens when the grass encounter rate is zero. When it's zero, we just don't do anything. It doesn't zero out the table,
11:21
doesn't replace it with some dummy data. It just leaves whatever was there before. So this means that in these areas with these coast tiles, as long as that area doesn't itself have grass Pokémon, we can use this to encounter grass Pokémon from other areas. And that's why we specifically do this glitch on Cinnabar Island,
11:40
as opposed to anywhere else in the game that has these coast tiles with the land on the left side. Because we can fast travel to any town, and Cinnabar Island is a town, it's easy for us to fast travel there and get to this coast in particular without ever passing through an area with grass Pokémon.
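To make the "leaves whatever was there before" behavior concrete, here is a small Rust sketch. The struct and field names are hypothetical, and storing the rate unconditionally is an assumption. Only the "copy the table when the rate is greater than zero" rule comes from the talk.

```rust
/// Ten (level, id) pairs, two bytes each.
pub const TABLE_LEN: usize = 20;

/// Per-area grass data as it might sit in ROM (hypothetical layout).
pub struct AreaData {
    pub grass_rate: u8,
    pub grass_table: [u8; TABLE_LEN],
}

/// The single spot in RAM holding the current area's grass encounters.
pub struct CurrentGrass {
    pub rate: u8,
    pub table: [u8; TABLE_LEN],
}

/// Called when the player enters a new area.
pub fn load_area(current: &mut CurrentGrass, area: &AreaData) {
    // Assumption: the rate itself is always stored.
    current.rate = area.grass_rate;
    if area.grass_rate > 0 {
        current.table = area.grass_table;
    }
    // When the rate is zero nothing else happens: the table is not
    // cleared, so it keeps the bytes from the last area that had grass.
}
```

Loading an area with a grass rate of zero after one with grass leaves the earlier area's table in place, which is what the coast tiles then read.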
12:01
So that means as long as we can use the game's fast travel system, we can use this glitch to encounter the grass Pokémon from any other area. Now that by itself isn't necessarily the most useful thing in the world. It was used for something very specific. There's a place in the game called the Safari Zone where there's a bunch of really rare Pokémon that you kind of want to get,
12:23
but it uses its own special encounter mechanic that was really annoying and everybody hated. But with this glitch, you could just go into the Safari Zone, travel to Cinnabar Island and go on these coasts, and then you would be able to encounter the Pokémon from the Safari Zone with the normal wild Pokémon encounter mechanics.
12:42
And this was really useful. This wasn't as widespread as Missingno, but a lot of people did know about it, and this was called the Fight Safari Pokémon Glitch. We don't just want to catch Safari Zone Pokémon. That's cool and all, but we want more. We want 128 Rare Candies. If we're going to do that, we need to do this glitch
13:01
when the grass encounter table contains information that isn't a real Pokémon encounter table. Now, when this glitch was originally discovered, it actually wasn't done the way I demonstrated it to you. It was done by trading with an NPC in the lab on Cinnabar Island. NPC stands for non-player character.
13:23
The Pokémon games have a system where you can send a Pokémon to another trainer and get one back in return, and there are a few NPC characters who will do this without you having to have any friends in real life. Now, this game was really constrained on memory, and everything had a very specific spot in memory
13:41
where it was stored, and oftentimes those addresses were reused, and this is one of those cases. Whenever you traded with another player or NPC, that person's name was stored in the same spot as the grass encounter table. So to see why this is useful to us, let's look at how encounter tables
14:02
are actually stored in memory. So like I said, everything in this game just has a very specific address, and the address for the grass encounter table for the current area the player is in is 0xD887. That 0x means it's hex notation. Now, this is a table with 10 entries.
14:20
The first byte is the encounter rate, how likely you are to run into a wild Pokémon. The lower this number is, the more likely you are to have an encounter. Then we have, for each of these 10 slots, a pair of two bytes. The first byte is the level of the Pokémon in that slot, and the second byte is the ID of the Pokémon in that slot. And each of these slots have a fixed percentage chance
14:42
of running into them. So the first two are about 20%, the next one's about 15%, and then about 10%, and so on and so forth. So let's take a look at what a real encounter table looked like. At address 0xCFA3, you'll find the encounter table for an early game area called Mount Moon. So we copy over the encounter rate, 10 is really low,
15:00
so we're likely to run into a lot of Pokémon. And then the three most common Pokémon that you'll run into is the Pokémon Zubat. And then after that, that looks like a rocky Zubat to me. I don't know. So then we finish this, once we copy over this whole encounter table, actually for this one in particular, something might stand out to you. It's all Zubat.
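The table layout described above (one rate byte followed by ten level/ID pairs, 21 bytes in total) can be sketched in Rust like this. The type and function names are made up for illustration, and the bytes are treated as opaque numbers rather than real Pokémon data.

```rust
/// One encounter slot: the level byte followed by the Pokémon's internal ID.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Slot {
    pub level: u8,
    pub species_id: u8,
}

/// An encounter table as described in the talk: a rate and ten slots.
#[derive(Debug, PartialEq, Eq)]
pub struct EncounterTable {
    pub rate: u8,
    pub slots: [Slot; 10],
}

/// Interprets 21 raw bytes as an encounter table. No validation is done:
/// any bytes at all will "parse", which is what makes the glitch possible.
pub fn parse_table(raw: &[u8; 21]) -> EncounterTable {
    let mut slots = [Slot { level: 0, species_id: 0 }; 10];
    // Bytes 1..21 form ten two-byte pairs.
    for (slot, pair) in slots.iter_mut().zip(raw[1..].chunks_exact(2)) {
        *slot = Slot { level: pair[0], species_id: pair[1] };
    }
    EncounterTable { rate: raw[0], slots }
}
```

Because nothing checks that a level is at most 100 or that an ID names a real Pokémon, whatever bytes happen to sit at this address are treated as a valid table.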
15:20
I don't know, I guess the game developers were like, hey, should we maybe put some Pokémon here? And they're like, oops, nope, all we've got is Zubat. It's all we got, sorry. And so kids would go through this area, and every two steps, they would see another fucking Zubat, and they would go to sleep, and in their dreams all they could see is these hordes of Zubat, and they would hear... Actually, I'm sorry, y'all.
15:41
Can you hold on one second? Hey, hey, have you not seen the news? There is a pandemic going on outside. Come on! Thank you. Sorry about that, Zubats, I swear. All right, so as I said, once we've got the encounter table copied over,
16:01
this is what it looks like for Mt. Moon. But of course, we don't want to run into a lot of Zubats because, well, that would just make me sad. So let's look at what happens when we copy over this trainer's name. Now, all NPCs that can trade with you in the game are for some reason just named trainer and nothing else.
16:21
So we're going to look at what happens when you copy over the string "Trainer". Now, Pokémon Blue used its own custom text encoding. So "Trainer" is actually only a single byte. There is a control character that is "print the word Trainer". And then the end-of-name marker control character is 80 in decimal.
16:40
So we copy this over. We copy over "Trainer", which is the first byte, and that just gets ignored because we're using the water encounter rate anyway. And then 80 happens to be the ID of Missingno. And so this is a really great way to do the glitch because you get an encounter table that's just all Missingno, all level 80 Missingnos specifically. And it's great, but the problem with doing it this way is that when you trade with an NPC,
17:02
you can only do it once. And once you've traded with that NPC, you can never do it again. So you can encounter a lot of Missingnos this way, but as soon as you go to an area with grass Pokémon, you're not going to be able to do the glitch this way again. So we want something that we can actually repeat because we don't just want 128 Rare Candies. We want 128 Rare Candies as many times as we want.
17:22
Because Rare Candies are delicious. So that's where the old man glitch comes into play. Now it's called the old man glitch because, well, his name is Old Man. And really, what his name specifically is isn't that important. What's important here is that he has a name.
17:42
Because this game was optimized for code size, they didn't want to have the code be too large on the cartridge or they might have to double the size of the ROM available to them on the cartridge. And given the budget they were working under, they just couldn't afford to do that. So everything was optimized for code size. So this tutorial could have been implemented
18:01
as go to a completely separate piece of code that goes through this tutorial, or you could do what they did, which is have it go through the normal battle code and just add like three or four conditionals to the battle code of, hey, is this the tutorial? If so, don't accept player input here. Now, the problem with doing that is
18:21
there's code in there that does things like print the player's name. But since the player is not in control here, we don't want to print the player's name, we want to print old man. So they need to copy over old man to where the player's name is stored, which means they need to store the player's name somewhere. And where did they decide to do it? You guessed it, the grass encounter table.
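The name swap described above can be sketched in Rust as follows. This is only an illustration of the memory reuse: the struct, the function names, and the 11-byte name buffer size are assumptions, not details given in the talk.

```rust
/// Assumed size of a name buffer, for illustration only.
pub const NAME_LEN: usize = 11;

/// A tiny slice of the game's RAM (hypothetical layout).
pub struct Ram {
    pub player_name: [u8; NAME_LEN],
    /// Rate byte followed by ten (level, id) pairs.
    pub grass_encounters: [u8; 21],
}

/// Start of the catching tutorial: the battle code prints `player_name`,
/// so the real name is stashed in the grass encounter area and replaced.
pub fn begin_tutorial(ram: &mut Ram, old_man: [u8; NAME_LEN]) {
    ram.grass_encounters[..NAME_LEN].copy_from_slice(&ram.player_name);
    ram.player_name = old_man;
}

/// End of the tutorial: the real name is restored, but the stashed copy
/// is left behind in the grass encounter area.
pub fn end_tutorial(ram: &mut Ram) {
    let mut saved = [0u8; NAME_LEN];
    saved.copy_from_slice(&ram.grass_encounters[..NAME_LEN]);
    ram.player_name = saved;
}
```

After the tutorial the player's name is back where it belongs, yet its bytes also remain at the start of the grass encounter data until some area with grass Pokémon overwrites them.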
18:43
So for this demonstration, I set my name to Hi Sean. And let's take a look at what happens when we interpret that as an encounter table. So the H is going to get copied over into the rate, and that just doesn't matter again, because that's just ignored. And then I is 168, and A is, hey, that's a Missingno. And then another 168, and hey, another Missingno.
19:01
And then we got level 174, and wouldn't you know it, they're all Missingno. It's almost like I specifically picked a name that only had Missingno characters in it to make it easier to demonstrate this glitch to y'all. What a coincidence. Now, of course, each of these Missingno came from a different letter in my name,
19:21
which is weird, because those are different IDs, and so therefore they should be different Pokémon. They shouldn't just all be Missingno. So let's talk about what even is a Missingno. Contrary to what you might think, it's not a single Pokémon. There are actually 39 distinct Pokémon which are called Missingno. And it's not just reading garbage data.
19:42
Even though its sprite is clearly garbage data, it has a well-defined name. It's printing Missingno there, not just random garbage. And it's not like in the code they could say, "hey, is this garbage? If so, print Missingno." You can't really detect what garbage data is at runtime, so this is clearly in a table somewhere that it's expecting to print this word out.
20:03
And a lot of other attributes that it has are well-defined as well. To understand why some of its attributes are garbage but others aren't, we need to see how Pokémon are stored in the code. When most people think of a list of Pokémon, they think of the order that they appear in the Pokédex, the in-game encyclopedia.
20:23
Every Pokémon has a number associated with it, and they're loosely ordered in the order that you would encounter them in the game. But that's not how they're stored in the game. When most people think of Pokémon number one, they think of Bulbasaur, because that's the first in the Pokédex. But actually the Pokémon with the ID of 1
20:40
is called Rhydon. In the code, most of the data related to the Pokémon is stored in the order they were originally created. The game was supposed to ship with 190 Pokémon at one point in development. 40 of those got either cut or saved for another generation, and then one got added at the very last second.
21:02
And so Missingno is what's stored in the slots where the cut Pokémon were supposed to be. Now for the most part, those entries where Missingno's data lives are just zeroed out. It's all zeros. There are some exceptions, like its name. But for example, its Pokédex ID is zero. Its cry, the sound that it makes when you encounter it,
21:21
is almost always zeroed out. So for anything that's ordered by internal ID, we're going to get well-defined but zeroed data. But for anything that's stored in Pokédex order, we're going to get garbage. And let's look at why that is. One of the things that's stored in Pokédex order is the Pokémon base stats table.
21:41
This includes things like its attack and defense and HP, what it can evolve into, what moves it can learn. And importantly, it also contains the pointer to the sprite. So the first thing the game is going to do is look up the Pokémon's Pokédex number from its internal ID. And for Missingno. that's zero.
22:01
Now because Pokédex numbers start at one, they subtract one from that to turn it into an index into an array. This is an unsigned 8-bit integer, so this is going to underflow and we're going to get 255. Or you could think about this like we were trying to get the Pokédex entry for a hypothetical 256th Pokémon.
22:25
Now the array that we're indexing into only has 151 elements in it, so we read way past the end of the array. And in the case of Missingno's sprite, where it ends up reading is the middle of some data for NPC trainer parties in an area called Route 17.
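The underflow described above is easy to reproduce in Rust. This is a small sketch with hypothetical function names. The checked `get` call stands in for what the original assembly did not do, namely refuse to read past the end of the 151-entry table.

```rust
/// Number of real Pokémon, and so the length of Pokédex-ordered tables.
pub const POKEDEX_LEN: usize = 151;

/// Pokédex numbers start at 1, so the array index is the number minus one.
/// On an unsigned 8-bit value, 0 - 1 wraps around to 255.
pub fn base_stats_index(pokedex_number: u8) -> u8 {
    pokedex_number.wrapping_sub(1)
}

/// A bounds-checked lookup. The game had no such check, so for
/// Pokédex number 0 it read whatever bytes sat at "entry 255".
pub fn lookup<T>(table: &[T; POKEDEX_LEN], pokedex_number: u8) -> Option<&T> {
    table.get(usize::from(base_stats_index(pokedex_number)))
}
```

With this check, Pokédex number 0 yields `None` instead of data from well past the end of the table.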
22:42
And when you interpret that as a pointer, it points to some code related to the Safari Zone. And so Missingno's glitched-out sprite is what you see when you interpret that code as if it were an image. But most data in the game isn't stored in Pokédex order. That's the exception, not the rule.
23:02
Ironically, the Pokédex itself is one of the things that isn't stored in Pokédex order. So Missingno even has a valid Pokédex entry. Well, almost. You can see it's got a name and a description and a height that looks like a placeholder, but then the weight is just sort of a random number. Its entry wasn't localized,
23:21
and the structure of Pokédex entries in the Japanese version of the game was a little bit different than the English version. If we look at the Japanese version though, we can see, oh, yes, no, there's clearly valid data here for this version of the game. It's the question mark, question mark, question mark Pokémon. Its weight is 10, its height is also 10 because height was in decimeters for some reason.
23:43
And then that description translates to "comment to be written". There are some differences between the different Missingnos though. In fact, a lot of them have unique data. The cry that a Pokémon has, the sound it makes when you first encounter it, is stored as three bytes, one that's just sort of the base sound,
24:01
and these are shared between multiple Pokémon, and then another byte for a pitch adjustment, and then another byte for speed adjustment. And nine of the Missingnos have cries that aren't zeros. And a few of those actually have base sounds that aren't heard anywhere else in the game. And this supports the idea that some of these Missingnos are, in fact,
24:20
Pokémon that were just cut during development. There's also a few places in the game where they needed to display a sprite as if it were a Pokémon, but the sprite they wanted to display isn't associated with any real Pokémon, and so some of the Missingno entries are where they store those sprites. These would only show up if you had a lowercase W, X, or Y in your name though,
24:41
so most folks never saw these. And this is really important. Which version of Missingno you saw was based on your name. And this is also why, if you did this glitch, in addition to Missingno, you would see some high-level real Pokémon. Printable characters in Pokémon Blue's text encoding
25:01
start at 128, sort of the opposite of ASCII. So no matter what your name was, the characters that would end up in the level spots for the encounter table would be higher than you're supposed to be able to reach in the game. They would always be higher than 100. You could also get some glitched trainer battles this way, but those would only appear
25:21
if you had punctuation in your name, so most people were unaware of that. I certainly had no clue that that was a thing until I started doing research for this talk. Now you might be asking, if the encounter table was based on your name, why could everybody do this glitch? Surely it would be possible to have a name that didn't map to Missingno at all.
25:42
This is sort of true. It was possible to have a name that didn't include Missingno. But even if that was the case, you could still get 128 Rare Candies. And it was pretty unlikely that you would have a name that didn't include Missingno for a few reasons. The control character used for the end of your name
26:01
was stored as 80 in decimal, which is one of the IDs of Missingno. So if your name was an even number of characters, you could always encounter a Missingno. And a lot of players didn't even pick their own name, they just used one of the preset ones the game offered you. By pure luck, every single one of these names
26:21
has the right characters for a Missingno encounter. Except it wasn't even luck, because the Missingno characters were really common. They included uppercase S, H, and M, and most lowercase vowels. So the odds of you having one of these in the right place were really high. But even then there was a catch-all.
26:42
Every custom name could at least encounter Missingno's sister Pokémon, Tick M. And we call it Tick M because those are the only characters in its name that you can actually say. Now even though they have the same sprite, Tick M is different. As you can probably tell from the weirder characters in its name and its decision not to wear a mask,
27:03
everything about Tick M is garbage. The graphics that appear in its name are going to be based on things like your party stats or your position on the map. Tick M is what you get for internal ID 0, so you're going to get garbage even for data
27:20
that isn't in Pokédex order. Even when it's looking up by internal ID, it's going to underflow and read past the end of whatever data it's trying to read. Now Tick M had some interesting differences from Missingno. Its cry, being garbage data, would randomly change based on what screen you're on. It could evolve into Kangaskhan, so I guess this is what a baby Kangaskhan looks like.
27:45
You could also lock up your game by catching it. But if your goal was just to get 128 Rare Candies, it didn't matter if you saw a Missingno or Tick M. So now let's talk about why the sixth item in your inventory gets duplicated. This has to do with what happens
28:01
after you encounter a Pokémon. This all comes back to that Pokédex that we mentioned earlier. Its function in the game is to keep track of every Pokémon you've seen or caught. Any Pokémon that appears on this list is one that you've seen before, and the little ball icon next to its name
28:20
means it's been caught. This is stored in memory as a bitmap, one bit per Pokémon. It's sort of like an array of booleans, but instead of each entry in the array taking up one byte, it takes up one bit. So one byte represents eight entries in the array. Now, as you might have guessed, this array is stored in Pokédex order.
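A minimal Rust sketch of that kind of bitmap, with the bounds check the original code did not have room for. The type name `DexFlags`, its methods, and the choice to number bits from the low end of each byte are assumptions for illustration, not the game's actual code.

```rust
/// Number of real Pokémon tracked by the Pokédex.
const POKEMON_COUNT: usize = 151;
/// 151 flags rounded up to whole bytes: 19 bytes, or 152 bits.
const DEX_BYTES: usize = (POKEMON_COUNT + 7) / 8;

/// One flag per Pokémon, packed eight to a byte, in Pokédex order.
struct DexFlags {
    bits: [u8; DEX_BYTES],
}

impl DexFlags {
    fn new() -> Self {
        DexFlags { bits: [0; DEX_BYTES] }
    }

    /// Sets the flag for a 1-based Pokédex number.
    /// Returns false, without writing anything, if the number is out of range.
    fn set(&mut self, dex_number: usize) -> bool {
        if dex_number == 0 || dex_number > POKEMON_COUNT {
            return false;
        }
        let index = dex_number - 1;
        self.bits[index / 8] |= 1 << (index % 8);
        true
    }

    /// Reads the flag for a 1-based Pokédex number; out-of-range numbers read as unset.
    fn get(&self, dex_number: usize) -> bool {
        if dex_number == 0 || dex_number > POKEMON_COUNT {
            return false;
        }
        let index = dex_number - 1;
        self.bits[index / 8] & (1 << (index % 8)) != 0
    }
}
```

The range check costs a few comparisons per call, which is exactly the kind of code that gets dropped when every instruction counts.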
28:41
So when you encounter Missingno, it tries to mark that you've encountered a hypothetical 256th Pokémon. But since there are only 151 Pokémon in the game, this ends up writing way past the space used for this. The bitmap has to be rounded up to a number that's divisible by eight since it has to fit in a byte.
29:01
So what ends up happening here is there are 152 bits for real Pokémon, and where it ends up trying to write is the high bit of the 13th byte after the end of your Pokédex. Now, the inventory is what's stored immediately after your Pokédex in RAM.
29:21
The inventory is stored as one byte for the number of items that you have. And then for each item, there's one byte for its ID and then one byte for its quantity. So that means the byte that it tries to write to corresponds with the quantity of the sixth item in your inventory.
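The arithmetic can be checked with a small Rust simulation. This is a sketch under stated assumptions: the memory layout is simplified to exactly what the talk describes (a 19-byte bitmap followed directly by the inventory), the inventory capacity `MAX_ITEMS` is a made-up value, and the function names are hypothetical.

```rust
/// Bytes used by the Pokédex bitmap: 152 bits.
const DEX_BYTES: usize = 19;
/// Hypothetical inventory capacity, chosen only so the buffer is large enough.
const MAX_ITEMS: usize = 20;
/// Simplified layout: [bitmap: 19 bytes][item count: 1 byte][(id, quantity) pairs].
const RAM_SIZE: usize = DEX_BYTES + 1 + 2 * MAX_ITEMS;

/// Sets the "seen" bit for a 1-based Pokédex number (must be at least 1)
/// WITHOUT checking it against 151, mimicking the behavior described in the talk.
fn mark_seen_unchecked(ram: &mut [u8; RAM_SIZE], dex_number: usize) {
    let index = dex_number - 1;
    ram[index / 8] |= 1 << (index % 8);
}

/// Offset of the quantity byte for a 1-based inventory slot.
fn quantity_offset(slot: usize) -> usize {
    // Skip the bitmap and the count byte, then earlier (id, quantity) pairs,
    // then this slot's id byte.
    DEX_BYTES + 1 + 2 * (slot - 1) + 1
}
```

Marking Pokémon number 256 touches bit 255, which is the high bit of byte 31. With this layout, `quantity_offset(6)` is also 31, so the write lands on the sixth item's quantity.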
29:40
And another way of saying it sets the high bit of the quantity of the sixth item in your inventory is it adds 128 of that item as long as you had less than 128 before. Now, one other side effect of encountering Missingno is if you had beaten the game when you performed the glitch, you'd notice that the place where it stored the team
30:01
that you used to beat the game was now corrupted. And this is caused by Missingno's sprite. Remember when I pointed out that the pause at the start of the fight was abnormally long? This corruption is why that happens. Due to the amount of space they needed to decompress these sprites, they can't do it on the console's RAM.
30:21
So instead, they do it on the cartridge's persistent storage. The space they use for this is large enough for a seven by seven sprite, which is the largest that appears in the game. But the data that represents Missingno's sprite says that it's 13 by 13. So they write way past the end of that buffer, and the next thing on the cartridge's storage
30:42
is the Hall of Fame. But because Missingno's sprite was read from ROM, not RAM, that means that the sprite data never changed, and so everybody who did this glitch would see the same corrupted Hall of Fame. Although you wouldn't know it immediately because some of the names of the Pokémon would include things like the control character
31:01
for printing your rival's name. So a lot of people would see an Omanyte named Gary because Gary was a really common rival name, but it would change a little bit. Now, this bug would have been avoided if there was some bounds checking in the sprite decompression code. Like I mentioned before,
31:21
everything in this game was optimized for code size. If you're only dealing with a known set of trusted inputs, omitting these checks seems perfectly reasonable. When the code received real sprites, it always behaved perfectly. The only reason this code misbehaved was because of a completely unrelated bug
31:41
that caused it to get garbage data. Now, those are the only two abnormal effects of encountering a Missingno compared to encountering any other Pokémon. Remember that the main way this glitch spread was through word of mouth. That means that there were a lot of untrue or half-true rumors that spread around,
32:02
and I'd like to debunk a few of those. The biggest piece of misinformation you might have heard is, don't catch Missingno or it'll corrupt your save. And this is just straight up false. There's no ill effects of catching Missingno, and there's really nothing about it that can't be saved normally. I think the source of this misinformation
32:20
was a very specific problem that can arise with Tick M. In the games, you can bring up to six Pokémon with you in your party. And if you catch another one when your party is full, it gets sent to a storage system. And when you open up this storage system later, the game has to recompute the stats for all of those stored Pokémon. And there's a bug in this calculation
32:40
where if it tries to compute them for a level zero Pokémon, it gets into an infinite loop. Now, you're never supposed to be able to encounter a level zero Pokémon. But if you did this glitch with a custom name, you could always encounter a level zero Tick M. It would always occupy the bottom two spots in that encounter table. And since at the point you did this glitch,
33:03
you probably had six Pokémon in your party. That means it probably went to storage. So I think this was the source of that rumor. Another thing you might have heard is that catching Missingno would cause all sorts of graphical glitches. Nintendo even put out a statement saying to try releasing it to fix the scrambled graphics.
33:22
And if that doesn't work, you need to restart your game. And just all of this is nonsense. There's a specific mirroring effect you can cause if you view the stats screen for Missingno. What happens here is on the stats screen, the sprite for the Pokémon is displayed mirrored. And there's a byte that says,
33:41
just render the sprite mirrored. And for whatever reason, when you view the screen for Missingno, it doesn't set that byte back to zero afterwards. But only front-facing sprites are supposed to be rendered mirrored. So anytime it's a sprite that represents something's back, it gets this weird jagged effect. But this would go away if you viewed the stats screen for any other Pokémon
34:03
because it would set that byte back to zero correctly. There were some bigger glitches if you had Missingno in the follow-up game, Pokémon Yellow. But in that game, they also fixed the bug that let you encounter Missingno in the first place. So I don't think that's the source of this.
34:21
Finally, encountering Missingno wouldn't save your game. This is a really weird rumor, and I'm surprised it even got started because it's so easy to verify as false. You just do a Missingno encounter and reset without saving and see, yeah, no, that did not save. I think the source of this one is an N64 game called Pokémon Stadium.
34:40
It included an emulator and let you play the first two generations of Pokémon games. And whenever the cartridge's storage was written to in that emulator, it would display the word "saved" on screen. So when that buffer overrun happened, that corrupted the Hall of Fame, that would cause the N64's emulator to display "saved" on screen.
35:00
And I think that's where this one came from. So now we've seen every piece of this glitch. We can see that it was just a bunch of small, seemingly benign interactions between unrelated bits of code. No individual piece of this glitch stands out to me as insane or something that obviously would have been stopped in code review.
35:24
When you combine all this together, you get one of the most famous glitches of all time. But it's not the result of some horrendously bad coding or lack of QA or any of the other things you might hear people say about this game. Every piece of this glitch, by itself, was relatively benign.
35:43
Or just due to completely unrelated parts of the code interacting in ways that nobody would have expected. And this was handwritten in assembly under massive space constraints. Every instruction mattered. I certainly don't think I would have done
36:01
any better than they did. And I don't think anybody watching this would have either. A phrase that I've heard from folks making fun of the glitches in this game is "completely broken." And I think we should just remove that from our vocabularies entirely. In this case and many others where you would try and use that terminology,
36:20
it's more likely the software was developed under some constraints that you weren't aware of. And you wouldn't do better in the same circumstances. Sure, these days they're less likely to be technological constraints, but every single one of us has worked on a project where two days before the deadline, the requirements change out from underneath you. Or your company suddenly pivots and now you do medical services
36:41
and you have to figure out how to make a bunch of code relevant for that. To me, though, a lot of this glitch just boiled down to "because assembly." It's really easy for us to take the technologies we have at our disposal today for granted. Today, code size is rarely a hard constraint.
37:02
You're unlikely to ever work on a project where "this binary has to be 27K or smaller or we can't ship at all." When code size matters today, it's usually because of CPU caches and it's a thing we find while optimizing our code. And we run our code on machines powerful enough to just include all sorts of safety checks
37:21
and never give it a second thought. But in 1996, "just use Rust" wasn't an option. And even using C wasn't an option. I'm really glad that we don't live in that world anymore. There's a really high-quality disassembly of the game available, which I used to research this talk, called pokered.
37:40
It doesn't have the comments in it that the real source code would have, but this team took the machine code from the game, disassembled it, and went from there and figured out where all of the labels would have been and where they would have used macros and turned it into something that resembles real source code somebody would have written. It's an amazing project, and it was invaluable for preparing this talk,
38:01
so a huge shout-out to the team who worked on that. I also want to shout out the organizers. This has been a great conference, and doing a virtual conference like this is a lot of work. I want to give a special shout-out to Nell Shamrell, who led the program committee and made it a point to do multiple run-throughs of every talk with the speakers before they presented them.
38:22
That means that you got a much higher-quality conference than you would have otherwise, so thank you so much. Finally, this talk was co-authored with my partner Tess, and she also helped me with a ton of the slides. Tess, I love you. Thank you so much for working on this talk with me. It was a blast. If you have any questions,
38:41
I will be in the Discord immediately after this. If you're watching this in Europe live, go to bed. It's like 3 a.m. What are you doing? If you're watching a recording and you want to ask me a question, here's my contact info. Feel free to reach out, and I'm happy to answer any questions you might have. Thank you so much for watching, and bye.