Video in TIB AV-Portal: JavaJournal

Formal Metadata

CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Despite the multitude of Java decompilers available, we often have the need to debug or trace malicious or obfuscated Java bytecode. Existing Java debuggers and tracers are mostly targeted towards Java developers, are closed-source, and are not meant to handle malicious or obfuscated targets. We present a new open-source cross-platform framework for debugging Java, written completely in Python, designed specifically for reverse engineering. We also present a Java method call tracer as a sample Python application that utilizes this framework.
All right, so next up we have a talk about Java Journal.
Thanks for the introduction, everybody.
Good to see a lot of familiar faces out there. My name's Jason Geffner, I'm a principal security researcher at CrowdStrike, and I'm here today to talk with you about two projects that we're releasing: one called Java Journal and one called pyspresso.
As you probably guessed from the name of the talk, Java Journal, you're probably thinking right now, "Geez, do people still use Java?" Believe it or not, they do. Java is actually really, really popular. If we look at the TIOBE Programming Community Index, and I pulled this graph just a few days ago, the TIOBE Programming Community Index is an indicator of the popularity of programming languages. The light blue line is Java, and we can see an actual resurgence in Java over the past few years. Java has overtaken C in terms of popularity and is in fact the most common programming language that we're seeing these days.
But it's not just authors of legitimate software that are using Java more and more now. We're also seeing an uptick in the use of Java by authors of malware, especially cross-platform malware. So there's more malware written in Java out there, and as reverse engineers you're probably thinking that that's great news, because Java decompilation is easy. Where's the problem? Look at all the tools we have for decompilation: CFR, Fernflower, JD, Krakatau, Procyon. All we have to do is take whatever Java binary we want to analyze, throw it into any of these decompilers, and instantly get the source code. All done, right? Oh my god, I actually forgot: it's pretty easy to obfuscate Java code. So sometimes even having a decompiler really isn't enough. We're still going to have problems when it comes to obfuscated Java.
So this raises the question: how can we deal with obfuscated Java and analyze it? There are three main approaches. The first approach is to take the decompiled Java code and try to recompile it and debug it. Has anyone tried this, taking the output of a decompiled Java program,
putting it in an IDE, and trying to rebuild it? I see some of you laughing. I think some of you are laughing because you know that it almost never works. You're going to get a whole bunch of errors from the compiler, and even if you spend hours trying to fix all those errors and you can get it compiled, the chances of it running correctly are very slim. So it's very difficult to recompile and debug, and we can cross that off our list.
The next option is creating a deobfuscator. It seems like a good idea, and if you're writing your own AV engine maybe it makes sense, if you see the same obfuscation techniques used over and over again in a given set of malware families. But the truth is that the work required to do this doesn't really scale, especially if you're an individual researcher often seeing different sets of obfuscation used. The other thing to keep in mind when it comes to creating a deobfuscator, especially a static deobfuscator, is that oftentimes the obfuscation that's done will actually make use of runtime information that you can't easily glean statically. For example, it's common to see a string decryption function being used in a Java program, where the decryption function uses information from the runtime for its decryption key. It might look at the call stack and use that call stack information to generate the decryption key to decrypt the string. At run time it's pretty easy to figure out what the key would be, because that information is evaluated dynamically, but statically it's very difficult to write programs that automatically decrypt the strings for that type of challenge.
Which leaves us with dynamic tracing. I know everyone here knows of tools like Process Monitor and strace. They're really useful for capturing high-level information about the system resources that are used by a given process, but oftentimes that level of information is just too high
level to be able to get a good understanding of what a program is doing, such as a program written in Java. For, say, native Windows programs written in C++ and whatnot, we can use tools like Rohitab's API Monitor, which has a really nice interface. We can see every API call that's made, whether it's to a Windows DLL's exported function or a third-party DLL's exported function, and we can see all the information in the entire call tree. It would be really nice to have that type of information for Java.
So if we keep going with this thought, what would an ideal tracer look like? There are several things we'd want in an ideal tracer. First of all, we want it to be lightweight: ideally we don't have a bunch of third-party dependencies to deal with. We'd also like it to be extensible: in theory we'd be using some automation along with this, so we want to have the ability to write code on top of it and interface with it automatically. And because chances are we'd be using someone else's product, we'd want it to be well documented, so we can see
exactly what it's doing and how to use it.
Number two: I personally don't like writing Java code. A lot of the solutions out there for analyzing Java programs require the user, the reverse engineer, to actually write their own analysis scripts in Java. I don't like that requirement, so let's say we don't want to have to write in Java.
Third, we want it to be cross-platform. We want it to work on Java processes running on Windows, Linux, macOS, even Android.
Number four, we want it to capture all the information. If we're building call trees of all the method calls made, we also want to capture information like the arguments passed to methods and the return values.
Number five, we want to be able to begin tracing from the very beginning. There are some Java analysis tools out there right now that allow you to attach to an already-running process and then start capturing information. That's not ideal for dealing with malware, because you may miss some really important things at the very beginning of the malware program before you can actually attach to it. Also, if there is anti-debugging work going on in the malware process, by the time you actually attach to it, it might be too late. You might not even be able to attach, because maybe the process has terminated itself.
And lastly, we'd ideally not want to have to transform the Java bytecode in memory. The reason, just like the anti-debugging topic I mentioned, is that it's possible for malware to detect that its bytecode has been modified in memory.
As I mentioned, there are several options out there already. We have BTrace, which requires the user to write Java to build and trace the program. Bytecode Visualizer requires Eclipse, so it's not too lightweight. Other tools aren't open source or extensible, or are missing return values. Others are very heavyweight and don't properly show arguments, return values, et cetera. So there are a lot of great tools out there, and this is only a 30-minute talk, so I can't go through them all, but as good as these tools
are, none of them meet all of our requirements.
So what is our solution? Our solution is something that we built from the ground up: Java Journal, running on top of a debugging framework we wrote called pyspresso. So what is pyspresso? You can see it in the bottom right, in the green and blue blocks. It is a transport client that can go over TCP/IP or shared memory, with a debug interface on top of that to communicate directly with the JVM. Everything on the left actually ships with the standard edition of Java: the Java Virtual Machine Tool Interface and the Java Debug Wire Protocol agent. This JDWP agent, which is part of the JVM, can communicate with the debugger process over what's called the Java Debug Wire Protocol. Again, this can be done over TCP/IP for local or remote debugging sessions, or, if you're running a local debug session on Windows, you can use shared memory, which is a little more performant.
Because we're using a well-defined interface that is actually supported by Oracle, we don't have to worry about hooking things. That's nice, because once you start relying on hooks, maybe in the next version of whatever you're trying to hook, addresses change or function names change, and whatever you were hooking may not be reliable. By relying on a well-defined interface, chances are this is going to be a lasting solution that works for a long time.
Another thing to note is that both the pyspresso debug transport and the pyspresso debug interface are written entirely in Python, and they have absolutely no dependencies. That means a few things. It means there's no need to install any third-party packages. There is no need for external Java Debug Interface wrappers. And here's the thing that I really like: if you're debugging a remote Java process, you don't even need to install Java on your host system. That's pretty cool: you can debug Java processes without having to install Java on your host system. I like that a lot.
And here's the cool thing: because of the modular design of pyspresso with Java Journal on top of it, the actual debug loop and logging functionality in Java Journal, really all of Java Journal itself, is about a hundred lines of Python. So I was able to write this tool, which I'll show in a moment, in just about 100 lines of Python. And because we're so modular with this, we can actually write other programs on top of pyspresso as well. So maybe someone out here in the audience will next week write a passive profiler, or the week later a dynamic debugger. It's really very easy to write things on top of this debugging framework.
But enough talk, let's get to the demos. Before we even get to the demos, you have a choice between seeing a previously recorded demo or seeing a live demo. Do you have a preference? You want to see the live demo? All right, let's go with that.
So let's take a really easy sample to start with. This is just a Hello World sample. All it does is, when you run it, it prints out "Hello World". Pretty straightforward. So let's see what we have. Let's look at Java Journal.
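As background for the demos, the Java Debug Wire Protocol connection that the transport layer relies on can be sketched with nothing but the Python standard library. This is only an illustration of the protocol framing, not pyspresso's code: the function names are made up for this sketch, and it assumes the target JVM was started with the standard JDWP agent listening on a socket, for example with `-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000` (the port is an arbitrary choice here).

```python
import socket
import struct

HANDSHAKE = b"JDWP-Handshake"
HEADER = struct.Struct(">IIBBB")  # length, id, flags, command set, command
REPLY_FLAG = 0x80


def build_command(packet_id, command_set, command, data=b""):
    """Frame a JDWP command packet: 11-byte header followed by the payload."""
    length = HEADER.size + len(data)
    return HEADER.pack(length, packet_id, 0, command_set, command) + data


def parse_reply_header(header):
    """Split an 11-byte reply header into (length, id, flags, error_code)."""
    return struct.unpack(">IIBH", header)


def read_string(payload, offset):
    """Read a JDWP string: 4-byte big-endian length, then the encoded bytes."""
    (size,) = struct.unpack_from(">I", payload, offset)
    start = offset + 4
    return payload[start:start + size].decode("utf-8"), start + size


def recv_exact(sock, count):
    """Receive exactly count bytes or raise if the peer disconnects."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("JVM closed the connection")
        buf += chunk
    return buf


def query_vm_version(host="127.0.0.1", port=8000, request_id=1):
    """Handshake, then send VirtualMachine.Version (command set 1, command 1)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(HANDSHAKE)
        if recv_exact(sock, len(HANDSHAKE)) != HANDSHAKE:
            raise ConnectionError("not a JDWP endpoint")
        sock.sendall(build_command(request_id, 1, 1))
        while True:
            # Skip event packets (such as VM start) until our reply arrives.
            length, packet_id, flags, error = parse_reply_header(
                recv_exact(sock, HEADER.size))
            payload = recv_exact(sock, length - HEADER.size)
            if flags & REPLY_FLAG and packet_id == request_id:
                break
        if error:
            raise RuntimeError(f"JDWP error code {error}")
        description, _ = read_string(payload, 0)
        return description
```

With a JVM listening as assumed above, `query_vm_version()` would return the VM's textual version description. Starting the target with `suspend=y` is what makes it possible to begin tracing from the very first instruction rather than attaching late.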
OK. So here in my VM I have two JARs for two different demos. I have a Hello World JAR, which is the source that I just showed you compiled into a JAR, and I have pyspresso and Java Journal. So if I go ahead and run it... actually, let me see if I can magnify this, maybe change the resolution.
That's a little better. OK, so I'm running Java Journal and specifying the JAR to run, the Hello World JAR. And I'm saying that I don't want to begin the output at the very beginning of the JVM's execution, since I don't need to see all the JVM internals that occur before my JAR is executed, so I tell it to start the output when the Hello World class is loaded. I see a lot of output, and if I scroll up far enough I should see the string "Hello World" being printed.
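To picture the kind of nested method-call log this demo produces, here is a small sketch that turns a flat list of method entry and exit events into an indented call tree, optionally keeping only methods whose names start with a given prefix. The `Event` type, the event names, and the output layout are all assumptions made for illustration; they are not Java Journal's actual data model or log format.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str          # "entry" or "exit"
    method: str        # fully qualified method name
    detail: str = ""   # arguments on entry, return value on exit


def render_call_tree(events, prefix=""):
    """Render entry/exit events as indented lines for methods under prefix."""
    lines = []
    depth = 0  # nesting depth, counting only the methods that are shown
    for event in events:
        if not event.method.startswith(prefix):
            continue
        if event.kind == "entry":
            lines.append("  " * depth + f"{event.method}({event.detail})")
            depth += 1
        else:
            depth -= 1
            lines.append("  " * depth + f"-> {event.detail}")
    return lines


# Hypothetical events for a Hello World program.
EVENTS = [
    Event("entry", "HelloWorld.main", "args"),
    Event("entry", "java.io.PrintStream.println", '"Hello World"'),
    Event("exit", "java.io.PrintStream.println", "void"),
    Event("exit", "HelloWorld.main", "void"),
]

if __name__ == "__main__":
    print("\n".join(render_call_tree(EVENTS)))
```

Passing a prefix such as `"java.io."` narrows the output to calls in that package, which is the same idea as restricting a trace to a single package of interest.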
Because the output is written to a file, I can copy the output into an editor like Notepad++ and use it for expanding and collapsing the functions, and I would see that eventually there's a call to java.io.PrintStream.println with the string "Hello World".
Let me show you one other sample. This is similar to what I discussed earlier. This is a snippet of code, a function from a malware family called Adwind. It's also known as jSocket and AlienSpy and like a dozen other names. It's pretty prevalent right now, and it's written in Java. In this example, this function from jSocket takes as input an encrypted string, and it's called from many places throughout the code. You can see that the decryption key for the string is actually calculated dynamically based on the stack. So again, it's very difficult to try to reverse engineer this or write a deobfuscator for it statically. But we know which package this function is in, so we can run Java Journal again, this time by
specifying slightly different options. I say: run the JAR, and only show the method calls for methods in that jSocket package. I'll press Enter, and it's running the malware now. OK, it just ran, and I can see all the calls to any methods in that package. I can see the input strings that are encrypted, and I can see the decrypted return values. Again, doing this statically would be extremely time-consuming, but I'm able to now see what it's decrypting. It's looking up and decrypting "Program Files", then the string "VirtualBox Guest Additions", then "Program Files" again, and then "VMware Tools". And it seems that that's the last thing it decrypts; the process just terminates after that. Based on this, and since right now I'm actually running inside of VMware, I could pretty much assume that this is looking to see if it's running in a VM, and if it finds VMware Tools, it terminates. So again, determining this statically just by looking at the decompilation would have been very, very difficult and very time-consuming, but by running it through Java Journal I can see dynamically what's happening. And again, I could look at
the log file that was created as output and see in a little more detail what actually happens.
So what are the takeaways? The good news is you can download this right now. It's about 4,000 lines of Python code, again with zero dependencies, and it's very well documented too: we have about 400 kilobytes' worth of HTML documentation for this. You can download it from GitHub, and it's on PyPI as well, so you can just open your command line and type "pip install pyspresso".
Some things to note. First of all, it's still in alpha, which means, just like Oracle's official JVM, you should not run it in a production environment. But seriously, it is in alpha, and not every code path has been tested, so use it at your own risk. In addition to being very well documented, you also have the Java Journal sample application that we're releasing fully open source, and it's well commented as well, so you can take a look at that to get a feel for how to write your own programs on top of pyspresso.
As with any product, there are always more things that can be done. Some of the things we're looking to do now: better inspection of method arguments for stack frames, looking at things more natively. There is a little work that can be done to improve the object abstraction to make things more Pythonic. It would be nice to have the option to automatically attach to child processes that your debugged Java process creates. And of course, right now it's just text-based output, and it would be pretty nice if someone out here wants to create a very nice GUI for it, similar again to something like Rohitab's API Monitor.
So with that, I am happy to take any questions that you have. Thank you.
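The stack-dependent string decryption described in the talk can be illustrated with a toy analogue. This sketch is written in Python rather than Java, uses a trivial one-byte XOR cipher, and invents every name in it; it is not the malware's algorithm. It only shows why the plaintext depends on who calls the decryption routine, which is the property that defeats a purely static deobfuscator but is visible to a tracer that logs arguments and return values.

```python
import inspect


def _key_from_caller():
    """Derive a one-byte key from the name of the function that called decrypt()."""
    # Frame 0 is this function, frame 1 is decrypt(), frame 2 is decrypt()'s caller.
    caller = inspect.stack()[2].function
    return sum(caller.encode("utf-8")) % 256


def decrypt(blob):
    """XOR-decrypt blob with a key that depends on the calling function."""
    key = _key_from_caller()
    return bytes(b ^ key for b in blob).decode("latin-1")


def encrypt_for(caller_name, text):
    """Build test data: encrypt text so that only caller_name can decrypt it."""
    key = sum(caller_name.encode("utf-8")) % 256
    return bytes(b ^ key for b in text.encode("latin-1"))


SECRET = encrypt_for("check_vm", "VMware Tools")


def check_vm():
    # Called from the expected function, so the derived key matches.
    return decrypt(SECRET)


def other_caller():
    # Same ciphertext, different caller name, so a different key is derived.
    return decrypt(SECRET)
```

Calling `check_vm()` recovers the original string, while `other_caller()` yields garbage from the same ciphertext. A static tool would have to model the call stack at every call site to reproduce this, whereas a dynamic trace simply records what the function returned.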