Memory dumps can be a very powerful tool for application developers because they provide detailed information about the application state at the moment an error has occurred. I'm sure most of us have experience working with memory dumps to solve one very specific type of problem: finding the cause of OutOfMemoryErrors.

Why then, if memory dumps do such a good job of helping us fix memory issues, are they not used to analyze and fix other issues? The answer is twofold. Firstly, the available tools are largely geared towards solving out-of-memory situations. Secondly, most memory dumps are so big that it only makes sense to collect them from a device you have physical access to.

It's the second of these two problems that I'm offering a solution for. By making memory dumps small enough to be uploaded from a device, they can become viable tools for debugging more types of issues.

The first step towards that goal was to understand the strengths and weaknesses of the commonly used HPROF format. For me this process started while implementing an HPROF library used in a project to deobfuscate ProGuard-obfuscated dumps. My starting point was to extend the existing HPROF format with new functionality, but I soon came to the conclusion that if I wanted to create something better I needed to start from the ground up, discarding legacy compatibility.

When designing the format, my main inspiration came from binary protocol buffers, a format that is both efficient and flexible.
Still, I did not want to use a complete protocol buffer implementation but rather cherry-pick the best concepts from the format:

- Variable length encoding for integers and floating point values
- Repeated fields without the need to specify the length of the vector

At the same time, I also wanted to avoid some of the complexities of using protocol buffers, such as:

- Record size encoded into the header
- Support for handling unknown records

The end result of applying these and other improvements is a file format which achieves a size reduction of up to 97% compared to standard HPROF files. Nonetheless, it still contains plenty of relevant data for a developer to analyze.

This improvement brings memory dumps down to a size where they can evolve from being a one-trick pony into an indispensable tool for analyzing many types of issues.
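
To make the variable length integer encoding mentioned above more concrete, here is a minimal sketch of the base-128 "varint" scheme that protocol buffers use for integers. The class and method names (`VarintSketch`, `encode`, `decode`) are hypothetical and chosen for illustration only; the format described in this post may encode its values differently.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only: protobuf-style base-128 varints for 32-bit ints.
// Names are hypothetical and not taken from the dump format described above.
class VarintSketch {

    // Writes the value 7 bits at a time, lowest group first.
    // A set high bit on a byte means "more bytes follow".
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7; // unsigned shift, so negative inputs terminate too
        }
        out.write(value);
        return out.toByteArray();
    }

    // Reassembles the 7-bit groups until a byte without the high bit is seen.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("Truncated varint");
    }

    public static void main(String[] args) {
        int[] samples = {1, 127, 128, 300, 1_000_000};
        for (int v : samples) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + enc.length
                    + " byte(s), decoded back to " + decode(enc));
        }
    }
}
```

With this scheme, small values take fewer bytes than a fixed four-byte integer: 300, for example, fits in two bytes. The trade-off is that large or negative values can take up to five bytes.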