We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

GBForth: Using Forth to understand the Game Boy

00:00

Formal Metadata

Title
GBForth: Using Forth to understand the Game Boy
Title of Series
Number of Parts
561
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
During this talk we'll get a good understanding of Game Boy programming by reverse-engineering a ROM using Forth. We go beyond just decompiling the ROM to assembly and show how we created a cross-compiler that allows writing Game Boy games in Forth as well. You'll get to see how Forth interacts with the Game Boy hardware, and how the language can be extended to easily render sprites or play sounds for example. We show you how to use Forth to incrementally refactor Game Boy bytecode into higher levels of abstraction. No black boxes. No magic. This way you can understand and appreciate every layer of the hardware and CPU instructions one by one. A similar approach can help you understand other systems (NES comes to mind) and create a language that is more comfortable than ASM or C to work with. The talk is accessible to developers without former Game Boy, Forth or compiler experience. Topics covered: - Go through the basics of Game Boy hardware - Explain how rendering graphics works on a Game Boy - Outline the challenges of working with the Game Boy memory - Show how to reverse-engineer a binary using Forth - Describe the process of writing the cross-compiler - Talk about using GBForth to write Game Boy games
Game theoryHacker (term)EmulatorCompilerPascal's triangleSource codeBefehlsprozessorComputer hardwareVideoconferencingSquare numberTouchscreenFeedbackReverse engineeringBinary fileAbstractionRight angleGame theoryBitComputer programmingHacker (term)EmulatorForcing (mathematics)Line (geometry)Field (computer science)CompilerMultiplication signGroup actionMultiplicationCycle (graph theory)Information technology consultingMixed realityPointer (computer programming)Moment (mathematics)DebuggerSpacetimeQuicksortFeedbackSemiconductor memoryComputer architectureDampingEinsteckmodulComputer hardwareProduct (business)Physical systemVideoconferencingPoint (geometry)TouchscreenBefehlsprozessorReverse engineeringProjective planeCodeFunctional (mathematics)Address spaceComputer simulationSkewnessData structureCoefficient of determinationWavePersonal digital assistantBeat (acoustics)Cache (computing)outputCommunications protocolComputer animation
Game theoryReverse engineeringBinary fileCompilerAbstractionStack (abstract data type)Parameter (computer programming)BitCompilerFormal languageTotal S.A.Forcing (mathematics)Multiplication signXMLComputer animation
Stack (abstract data type)Parameter (computer programming)FactorizationFormal languageComputer virusMachine codeNintendo Co. Ltd.Formal languageCodeNumberWikiCompilerInternetworkingComputer programmingGame theoryStack (abstract data type)WordData structureRevision controlSequenceBitCategory of beingSpacetimeMetadataGraph coloringSubsetBefehlsprozessorDomain-specific languageParameter (computer programming)Computer fileNormal (geometry)TouchscreenHash functionReverse engineeringFlagQuicksortResultantCore dumpRaw image formatMereologyCASE <Informatik>HexagonGreatest elementPoint (geometry)CodeGroup actionProcess (computing)Multiplication signNatural numberProjective planeField (computer science)InformationEmailPlastikkarteInheritance (object-oriented programming)Element (mathematics)
Assembly languageDivisorMachine codeExtension (kinesiology)Formal languagePattern languagePattern languageWordArithmetic meanNumberCodePhysical systemArrow of timePoint (geometry)QuicksortMachine codeDoubling the cubeOpcodeConnected spaceHidden Markov modelComputer animation
Pattern languageFormal languageExtension (kinesiology)Complete metric spaceAssembly languageMacro (computer science)MereologyOpcodePoint (geometry)Positional notationBefehlsprozessorFormal languageComputer programmingAssembly languageCycle (graph theory)Game theoryData structureAdditionCodeBitTrailWordTable (information)SequenceFreewareComputer animation
Macro (computer science)Point (geometry)Assembly languageComputer fileBitComputer programmingComputer animationXML
Macro (computer science)Abstract state machinesRepresentation (politics)Assembly languageRevision controlMacro (computer science)Game theoryComputer programmingProjective planePattern languageData storage deviceCodeLibrary (computing)BitIntermediate languageImplementationComputer animation
Macro (computer science)Abstract state machinesRepresentation (politics)Intermediate languageSystem callHidden Markov modelCodeMathematical optimizationComputer animation
Abstract state machinesImplementationCodeCompilerPoint (geometry)WordGame theoryLevel (video gaming)CodePrimitive (album)MultilaterationAssembly languageComputer animation
LengthAssembly languageSemiconductor memoryComputer programmingAddress spaceBlock (periodic table)Level (video gaming)WordVideoconferencingLimit (category theory)Logical constantFunction (mathematics)ImplementationComputer animation
Game theoryDivision (mathematics)SpeicheradresseSemiconductor memoryLimit (category theory)Key (cryptography)Game theoryRead-only memoryDivision (mathematics)Run time (program lifecycle phase)Keyboard shortcutWordVariable (mathematics)ComputerUltraviolet photoelectron spectroscopyMereologyPhysical systemString (computer science)EinsteckmoduloutputReading (process)Computer animation
Game theoryGame theoryMereologyComputer programmingPhysical systemPoint (geometry)Roundness (object)XML
Game theoryAbstract state machinesCompilerMathematical optimizationSemiconductor memoryGame controllerWritingGame theoryEmulatorMathematicsStandard deviationTesselationSoftware bugCompilerRadical (chemistry)Fitness functionComputer programmingEinsteckmodulImplementationAssembly languagePrimitive (album)Level (video gaming)CodeFunctional (mathematics)Exception handlingVariable (mathematics)WhiteboardQuicksortComplete metric space
Abstract state machinesMathematical optimizationCompilerSemiconductor memoryGame controllerWritingGame theoryRange (statistics)Game theorySoftware bugMathematical optimizationCompilerComputer programmingComputer animation
Abstract state machinesCompilerMathematical optimizationGame controllerSemiconductor memoryWritingGame theoryProjective planeCondition numberGame theoryGraph coloringDampingSheaf (mathematics)Different (Kate Ryan album)EinsteckmodulRecursionCategory of beingDebuggerAbstractionSemiconductor memorySpacetimeComputer hardwareGame controllerLibrary (computing)Primitive (album)Address spaceCodeMultiplication signWordMathematical optimizationDecision tree learningLine (geometry)Computer animation
Standard deviationCycle (graph theory)Computer programmingWordoutputDecision theoryDoubling the cubeHacker (term)WhiteboardParticle systemSpherical capMathematical optimizationMultiplication signProjective planePointer (computer programming)1 (number)Point (geometry)DampingAssembly languageMixed realityDevice driverDifferent (Kate Ryan album)EinsteckmodulCompilerOffice suiteRadical (chemistry)EmulatorKey (cryptography)Computer fileData managementMereologyGame theoryComputer hardwareCalculationFloating pointCASE <Informatik>Software testingPlastikkarteBuildingString (computer science)Selectivity (electronic)Reverse engineeringSummierbarkeitPosition operatorStack (abstract data type)Limit (category theory)Virtual machineEmailNumberFlash memoryContext awarenessMappingComputer animation
Point cloudCanonical ensembleCompilerProjective planeMathematical optimizationLecture/ConferenceMeeting/InterviewComputer animation
Transcript: English(auto-generated)
All right, welcome everyone. So today we're going to talk a little bit about Game Boy programming, or actually discovering how Game Boys work using Forth.
First, introduce ourselves. This is my colleague David Einstein. We are working at a consultancy company in Amsterdam, Reactor. And our main work consists at the moment of JavaScript programming, not always the most challenging stuff. So we decided to start a sort of after-hours hacking group
called the Amsterdam Hackers with a few other colleagues. So for the first project we were thinking what is interesting to do, something novel, hopefully, and something that we can learn something from. And we started playing around with writing some emulators.
I started writing a chip-8 emulator, which is the most obvious place to start, I believe. And we were thinking, how can we make this into a bigger project? So we were thinking maybe we could do Game Boy emulators, but that's quite a saturated field already, and I don't think we would be able to compete with some of them out there.
A Game Boy game, also done quite often, and eventually we settled on writing a compiler for the original Game Boy.
Okay, so as Stein said, the Game Boy has been studied quite a lot already, so you can find plenty of manuals, tutorials about how to write games, how to use assembly, and everything. But the idea of the project was a little bit on the key points to try not to use any tooling.
We need the manuals. We cannot really write the compiler without the documentation on the hardware, but we try to avoid using emulators or debuggers or anything as much as possible, although at some points it was useful, but it should be something that shouldn't be completely necessary. Okay, so let's talk a little bit about the Game Boy architecture.
The CPU is a mix between the AT-AT and the Z-AT CPUs, but it's custom. The internal clock runs at 4 MHz, but every instruction is actually like multiple of four cycles. So in practice, the CPU is like 1 MHz.
There are like 32 KB of addressable from memory, but only 4 KB are working memory that you can actually use. There are another 4 KB, but that actually is hosted in the cartridge, and depends on the functionality of the cartridge that you're using. So there are eight registers, A to F.
Some of them you can actually combine them and use it in like 16 bits way, but they're actually all at 8 bits, so the product counter and the stack pointer. Then we have like the video hardware,
similar to like the Atari talk that we saw this morning. You have like a scan line that goes left to right and top to bottom. You will have some space and time between each line that is like the horizontal blank and between the end line and the top line, the vertical blank.
The interesting thing is that the video memory is actually unusable when the hardware is actually drawing. So you have to synchronize your code with the hardware to only write to the video memory in the vertical or horizontal blank. So this is quite tricky to get right,
and what we saw is like if we start writing the compiler, it's going to be really difficult to get any feedback. So you have to get pretty far just to get something on the screen. Additionally, there are some other systems like sound subsistence, you have some input, timers,
but these are like something that we can work on later. Initially, we focus on the video. So how do we start it? We read the manual, but as I said, it's not easy to write the compiler from scratch.
The feedback is going to be pretty long. It's not incremental, and we will probably need some debuggers or existing toolings to actually assist with this. So instead, we decided to start with a working game that, as time will show you later, was as a hello world,
the simplest ROM that you could find. We're going to reverse engineer this binary using force into a readable program that you can actually modify later on. Then on top of that, we will write the compiler. Time will show you the steps more carefully later. But let's explain a little bit, force.
Has actually anybody in the audience used force before? All right, nice. So force is basically the simplest language that you can imagine, at least for me. There is no syntax. Your code consists of space separated words. That's all. You have numbers, and you have other words.
Numbers will push themselves to a stack, and then the words will execute an action, usually taking parameters from the stack and pushing the result back to the stack. You can, of course, then define your own words. So in this example, we add five.
Well, five is pushed to the stack. We push one to the stack. Plus will pop two arguments from the stack, and dot will pop the result of plus and print it in the screen. The interesting thing here is you can take just one plus or any subsequence of words and extract it into your own definition.
So we define here increment as one plus. So this property, the ability to extract any subset of words into your own definition makes the language concatenative, and that's a property that's going to be very, very useful to reverse engineer the problem, as Stan will show you.
Additionally, if you pick carefully the words that are going to make your language, you can define domain-specific languages to model what you are trying to do. And Stan will show you as well a very cool example in the Assemblage.
So I will let Stan explain now how this works in practice. Hopefully. Okay, so we set up on the language that we would use for this project, and we looked over on the Internet for the most simple case of a ROM that we could find.
Just a ROM that prints hello world to the screen and doesn't do anything else, like just stops there. This was already something that we had no clue like how to make. So we just started from the ROM.
This is the part of the hex dump of the code. This was obviously completely meaningless to us at this point, but we knew that this was a working ROM, and we took actually to make sure that we would keep this ROM intact. Over the course of the process, we would keep like a hash of the file somewhere
and verify that like every step we took, like we are still compiling the same ROM. So we changed this file, pretty much just adding the word C comma in between every byte. So C comma is a word that takes the argument from the stack,
so pretty much like the value that comes before it, and writes it to a file, a ROM basically. So in a way, this was our first like version of the compiler that would only produce like a hello world ROM, but it's like executing and like producing something that actually runs on a Game Boy. Actually, at this point, we didn't try to run that,
but still like kind of like a meaningless sequence of numbers. So we went to look for offsets of where certain data is stored. The Game Boy documentation and like other like wikis online that we could find, like they show a quite distinct part of the cartridge,
which is the header, which contains some meta information like the title of the ROM, like a checksum to make sure that like all the bits are intact, manufacturer information, et cetera. And we identified like the offsets of this, and we can see here like I think the Nintendo logo on top,
and on the bottom we have the example title for our ROM, and we can simply extract this into a new word. So now we have a logo and a title word that still do the same thing, and due to the concatenative nature, you can simply just replace those sequences with a new word,
and we get something that's a little bit more readable if you imagine these like definitions to be abstracted away somewhere else. Still producing exactly the same ROM, but we can slowly like start seeing a structure in the program. Of course, like we keep repeating this for the rest of the program as well.
At some point we have all the sort of like raw data extracted, so some like flags that indicate if it's a color Game Boy game or a normal Game Boy game, like the title fields, et cetera. Then we are left with some bytes that we haven't like translated yet, and that's pretty much the program itself,
because of course like there's no documentation that will tell you what the actual code will look like. So for this we reference the CPU manual, and we'll find like certain numbers are like certain opcodes, and here's an example of the hexadecimal value 3C
pretty much translating to an increment A, so the A register, and a 04 translating to like a increment B. So first things we can of course replace all these like numbers with a new word that emits that number, and that will work fairly well,
but the nice thing is that we can find again like patterns in these, like in between the machine codes as well. So what you can do is abstract away the operands of an action, so the increments we can define separately, and basically combine it with an A and a B register as shown here.
There's not really like a meaning to the binary values, it's just like if you combine these like numbers in this way, it will end up becoming an increment A or an increment B. And basically to use it we can basically just run A increments, and it will emit an increment A like code to the ROM.
So of course like we can continue doing this, and at some points we'll find like more, let's say like patterns in between the opcodes, and we can start to create a sort of a DSL
or like a pattern matching like system on top of fourth. So we define here the like arrow words and the double colon words, and what it allows us to do is like basically write a complete assembler, and if you would look at the documentation of the CPU,
you would find pretty much the same table that you see in code here described in the manual as well. So we have like an increment instruction, and for a normal register we have a certain like opcodes, for the specific HL register, an opcode, etc.
I think that's everything we can say. There's some nice additions here that we also like keep track of the cycles. I don't think we use them, but like you can extend that as much as you want, and kind of like create your own language structure. So at this point it's really just a matter of
kind of a boring process of like looking up the opcode and translating it to the correct assembly instruction, and eventually we end up with a like program that looks like this. Again, this is the same program as the hexadecimal code that I showed before, or at least like a part of it.
So we have like a full assembler written in fourth with like a postfix notation, and you can use this pretty much to like create Game Boy games already. I think like a lot of Game Boy games were actually created like directly in assembly, so yeah. With the nice advantage of that, it's still running in fourth, so you could get like word definitions for free.
Whenever you see a sequence of assembly instructions, you can abstract them into a new definition that has a little bit more meaning to you, and we can continue that with more until we have like almost like barely any assembler visible anymore,
and like a bit more descriptive program. So at this point we're still checking that we're producing exactly the same file, so we're quite sure that everything is correct. Of course like we could have some typos and instructions that are not used, but that has been fixed later. So now we actually have a program that we can edit
without having to understand what offset we have to change what bytes, and we can produce like a new version of our Hello World ROM with like arbitrary text. Of course like this is not where the story ends, because this is just an assembler in the end.
We want to take it a bit further and implement fourth completely for the Game Boy. So ideally you wouldn't want to have to like work with assembler at all in a project. So there's like two ways of doing this. One, the kind of the most like logical way maybe would be to keep extracting like patterns in your assembly code
and go into macros and kind of like build some libraries on top of that to do various things, even like playing sound you could just do with macros. And you can kind of like program your whole game or program in like a macro language
basically on top of your assembly. Of course like we didn't do that, we went for the hard way, which is to actually parse like every definition that we do that you create ourselves, and kind of store all of the code that you want to write in an intermediate representation.
This is a bit more complex because you're not directly emitting assembly anymore, but it allows you to do things like lazy emitting. So you can create like a million definitions and if you don't use them in your like main codes, we won't emit them to the ROM, which is quite useful with the 32 kilobytes that you usually have.
Other things are optimizations. We have worked on tail call optimizations in this. There's like a lot of other things we can do using like intermediate representation. So we basically like started working on this.
The idea is that at some point we redefine the column work that usually like creates a new definition into something that collects the definitions that you wrote for a later compilation to the Game Boy. And slowly we added code primitives in assembler. So basically defining what does swap do, what does plus do, and so forth.
So eventually like we have all the primitives to have something working, we can add on top of that higher level definitions. So basically using existing fourth primitives to create new fourth primitives almost.
And slowly but surely we are translating the entire ROM to fourth and not using like assembler at all anymore, unless you may want to optimize something later. So this is more or less like what I think the same program looks like. It's not binary the same anymore, but it will produce the same hello world or hello FOS of them output.
You can see like there's no assembler being used anymore. We have some like quite high level words defined. You can see on the top some constants for like addresses that are like common to use.
Words like see, move, video that just like move a block of memory into your video memory. Kind of like according to the standard fourth implementation if you're familiar with it. So there are a couple of limitations in our approach.
Well, I guess not our approach, but stuff that we just didn't do. If you're used to fourth, like you might know that you're able to redefine words at runtime, that's not possible in our system, mainly because there's no keyboard inputs. So we didn't really know how to like handle that.
Plus like words are not stored as like in fourth, it's like a string in the Game Boy. We eventually just compiled them to like memory addresses. So it's quite a, I guess a quite static like system once it's compiled. Apart from that, we ran into some issues with the division between the read only memory and the random access memory.
On a computer, this is like pretty much like all RAM of course. On the Game Boy, you have to take into account that data needs to be written to the cartridge, which is read only. And once you want to modify variables, you'll need to explicitly copy them over to the other part of the memory,
which is also quite limited. And like part of it is used for the stack that we use in fourth as well. Okay, so we implemented a basic fourth system that should be able to compile a fourth program into a Game Boy ROM.
Of course, like until this point, I think we only tried to write like Hello World examples, nothing too fancy. Maybe we did some like moving around of stuff, but the basics were like, I mean, everything we did was quite basic. So we decided to find a third part of the fourth program.
And this is Soccer Bomb. It's quite old. It's included in the G4 implementation. And it's, well, Soccer Bomb, if you're familiar with it. Quite simple game. So the goal with this is like to try and compile this into a Game Boy ROM.
This doesn't work because like we had a lot of like primitives missing still, but the idea was to add primitives or functionality to the compiler until basically this program would like compile into an actual game. I think we barely had to like change anything except for the fact that variables cannot be initialized,
because again, like they have to be written to the ROM and you want them to be in the RAM. So there's like some initialization codes that needed to be added. But apart from that, we managed to get it working. I think like it's also like interesting to mention that the original fourth implementation is written for a terminal.
So rather than rewriting that, we created a terminal emulation layer on top of everything. So for, I think like for definitely for simple like tile based games that you could write with like ASCII Arts almost.
It's almost like trivial now to port terminal fourth games to Game Boy. I will admit that we had to scrap like half the levels because they didn't fit on the cartridge. So yeah. So in the end, like we ended up with a pretty complete fourth implementation.
I think we have some like tickets still open to make it like fully compliant to the end standards. But everything that's like missing now is there's a lot of like bugs in the assembler of the original Game Boy. If you've done anything like Game Boy programming, you might have heard of the increment bug or increment sprite bug,
where incrementing a register in a certain range will result in the sprites being completely messed up. I think like not even restorable. So these are the things that the compiler could kind of like automatically remove and refactor into something that will work.
There's a lot of like room for optimizations. I mentioned like tail recursion. We ended up not being able to include it because I know I think there were some like edge conditions that we didn't like think about yet. But there's other stuff we can do, inlining of primitives.
Like there are certain words that are like used all the time. And right now like they're just calls to another address, which is like reasonably expensive. If they can be inlined, of course it will be a lot faster, especially for like code that's like running in the V blank sections. There's some peephole optimizations, like with all these abstractions you will get pairs
of like assembly instructions that are basically in NOP when they're run together. Those can be removed to save some space and time. We don't have like Game Boy Color support. Game Boy Color is like quite similar to the original Game Boy. The main difference is color.
So there's like some extra registers that handle the palette data for the color. And those things are, I mean they're accessible through assembly, but of course that's not the goal of this project. Then there's like memory bank controllers. David mentioned already that like the cartridge often contains a lot of hardware.
That hardware is like managed by a chip in the cartridge. And you can kind of like write to the ROM addresses to access certain properties. There's like rumble cartridges that like vibrate the device. Just like the camera is a kind of a cartridge.
So those can be like still implemented in libraries. We have like features like ROM bank switching. Basically you cannot access the entire ROM at the same time in certain cartridges. So you have to like bank them. More debugging tools. And actually like we still have to write a tutorial of like how to write a fourth game from scratch.
And contributions are welcome if you want to take a look. Anything? So thank you for your time.
So the project is at amsamhacker.com on GitHub. We can also recommend these two talks. One called the ultimate Game Boy talk if you want to know more about, well I guess Game Boys because it covers pretty much everything. And it's a great talk about reverse engineering the hardware of the Game Boy.
Where actually chips are being, how they call it, uncapped to check the internals. I think the idea is to create also like a cycle, cycle accurate emulator.
I have a question. To replace the text from hello world to hello foster decision. What did you do about the headers and the checksums and all these things that are the beginning and the end of the ROM file? How did you understand what happened here? So the question is like how we managed to like go from replacing a string basically to replacing the whole file.
So like I think the main point is you mentioned also like checksums and deeds like I skipped over that. There are some like checksum calculations going on. Those are documented how they work.
Yes, like checksums for sure. And the rest is like just like decompiling to assembler and like trying to abstract it higher and higher. That was all there is to it. Yeah. So when translating soccer bat into with resort, how do you do for the mapping of the input actually?
I imagine that's how my input is. Yeah. So that's part of like the terminal emulation actually. So we replace like the word to actually get one key from the input to just go and look into the keypad driver.
So the question is how do we map like the registers in hardware to like the different pointers that you need to force.
Basically that's a convention. So we've reserved one of the registers for the double stack. That's a quite common optimization. Then we have one of the registers point to the double stack. And then we have some scratch register that you can use and change. So the compiler will always stick to the convention. It's good also because it means that you can actually mix assembly and forth if you know them.
Question in the context of an 8-bit machine with limited RAM and speed. Do you have an opinion, a strong opinion against or in favor of full compliance with the ANS kernel?
I think like we reached a conclusion that, sorry let me repeat. So like what the value is of like complying fully to these standards. I think like if most of the programs work well enough I kind of feel that there's not really a need to be fully compliant. Apart from that there's a lot of instruct, like a lot of words in the standard 4 that don't make sense on the Game Boy.
Like stuff that deals with the OS obviously but also like floating point numbers we're not like emulating. Double precision arithmetic? Yes, double precision is like, so those words we skipped and other ones are quite easy to add yourself. So I think like as long as we support like the common cases it's like, if it's usable I think it's like sufficient.
So the question is like how we test like ROMs.
There are indeed like special cartridges that use like SD cards to load to ROM. So they have like a little OS built in as a file manager. Or you can use like cartridge flashers, you can buy them online for more like persistent like flashing.
How long did it take us like this? That was a good one. So how long did it take? I think we managed to get to the assembly point in two months. And this was like with like two nights a week like after like office hours like working on it.
And I think like 80% of the project in three months. Like small optimizations later but yeah, sometimes three months I would say. Alright, thank you everyone.