My name is Mairsh Jones. I have worked in circuit design and online database development for the past 9 years.

Three Control Flow Obfuscation Methods for Java Software

11.23.2010

Java compilers translate Java source code into ‘.class’ files, which contain the Java bytecode for the classes. Much of the information about the source code is kept in the class files. Since the appearance of the first Java decompiler, the threat of reverse engineering has become a real concern. Without proper protection, class files can easily be decompiled and reverse-engineered into Java source code.

One defence against reverse engineering is obfuscation. Obfuscation is a process that preserves a program’s behaviour but makes it difficult to decompile, so that decompilers cannot derive usable source code from class files. A program that performs obfuscating transformations automatically is called an obfuscator. Obfuscation techniques are categorised as lexical obfuscations, data obfuscations, layout obfuscations and control obfuscations.

Lexical obfuscations modify the lexical structure of the program, typically by scrambling identifiers. All the meaningful symbolic information of a Java program, such as class, field and method names, is replaced with meaningless information. One example of such a tool is Crema, a Java obfuscator.
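As a rough illustration of identifier scrambling, the sketch below shows a small class before and after renaming. The class and method names (InterestCalculator, monthlyInterest, a, b) are hypothetical and chosen only for this example; they are not output from Crema or any other obfuscator.

```java
// Original: identifiers carry meaning that survives in the class file.
final class InterestCalculator {
    static double monthlyInterest(double balance, double annualRate) {
        return balance * annualRate / 12.0;
    }
}

// After identifier scrambling (hypothetical result): the behaviour is
// identical, but the names no longer tell a reader anything.
final class a {
    static double b(double c, double d) {
        return c * d / 12.0;
    }
}
```

Both versions compute the same value; only the symbolic information available to a decompiler has changed.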

Data obfuscations modify the data fields of a program. For example, it is possible to replace an integer variable in a program with two integer variables. Layout obfuscations involve obscuring the logic inherent in a program. Examples are scrambling identifier names and removing comments and debugging information. Control obfuscations make the control flow of a program difficult to understand. For example, an opaque predicate complicates the control flow by using conditional instructions that are always true (or always false). The always-true branch leads to the original code, whereas the false branch leads to code arbitrarily inserted by the obfuscator.
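The two examples in this paragraph can be sketched in a few lines of Java. This is a minimal sketch under stated assumptions: the class OpaqueDemo, the quotient/remainder split by 16, and the predicate seed * (seed + 1) being even are all chosen for illustration and are not taken from any particular obfuscator.

```java
final class OpaqueDemo {
    // Data obfuscation (assumed encoding): one int is stored as two ints,
    // a quotient and a remainder with respect to 16.
    static int[] split(int x) {
        return new int[] { x / 16, x % 16 };
    }

    // Recombine the two parts; in Java, (x / 16) * 16 + (x % 16) == x.
    static int join(int[] parts) {
        return parts[0] * 16 + parts[1];
    }

    // Original method.
    static int square(int n) {
        return n * n;
    }

    // Control obfuscation with an opaque predicate: the product of two
    // consecutive integers is always even, so the condition is always true.
    static int obfuscatedSquare(int n, int seed) {
        if ((seed * (seed + 1)) % 2 == 0) {
            return n * n;   // original code, always taken
        } else {
            return n + 42;  // bogus code inserted by the obfuscator, never taken
        }
    }
}
```

A human reader or a decompiler cannot tell from the branch structure alone that the else branch is unreachable; that has to be proved from the arithmetic.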

Control obfuscations are categorised as control aggregation obfuscations, control ordering obfuscations, dispatcher obfuscations and control computation obfuscations, which are explained in what follows. Control aggregation obfuscations change the way in which instructions are grouped together. Inlining and outlining are two of the most effective ways to obscure methods and their invocations. Control ordering obfuscations change the execution order of instructions; for instance, loops can be made to iterate backwards instead of forwards. Dispatcher obfuscations first flatten the control flow of a program.
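As a small sketch of aggregation and ordering, the hypothetical class below shows a method in its original form and again after two transformations: the loop is reversed, and the loop body is outlined into a helper with a meaningless name. The names OrderingDemo, sumOfSquares and f are assumptions made for this example.

```java
final class OrderingDemo {
    // Original: one method, loop runs forwards.
    static int sumOfSquares(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i] * values[i];
        }
        return total;
    }

    // Obfuscated: the loop runs backwards (control ordering) and its body
    // has been moved into a separate helper method (outlining).
    static int sumOfSquaresObfuscated(int[] values) {
        int total = 0;
        for (int i = values.length - 1; i >= 0; i--) {
            total = f(total, values[i]);
        }
        return total;
    }

    // Outlined loop body with a meaningless name.
    private static int f(int acc, int v) {
        return acc + v * v;
    }
}
```

Reversing the loop is safe here only because integer addition gives the same result in any order; an obfuscator has to establish that kind of independence before reordering.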

The structure of the flattened program becomes a dispatcher and some basic blocks. The dispatcher implements the control flow. Some NP-complete or PSPACE-complete problems are introduced into the dispatcher to cloak the program, which makes the dispatcher difficult to trace. Control computation obfuscations hide the real control flow of a program. One example is the insertion of instructions that have no effect.
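The flattening step can be sketched as follows. This is only an illustration, assuming a simple integer state variable as the dispatcher; it does not include the NP-complete or PSPACE-complete cloaking mentioned above, and the class and method names are hypothetical.

```java
final class FlattenDemo {
    // Original: structured control flow.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Flattened: every basic block becomes a case, and the dispatcher
    // (the loop plus switch on 'state') decides which block runs next.
    static int gcdFlattened(int a, int b) {
        int state = 0;
        int t = 0;
        while (true) {
            switch (state) {
                case 0: // loop test
                    state = (b != 0) ? 1 : 3;
                    break;
                case 1: // first half of the loop body
                    t = a % b;
                    state = 2;
                    break;
                case 2: // second half of the loop body
                    a = b;
                    b = t;
                    state = 0;
                    break;
                case 3: // loop exit
                    return a;
                default:
                    throw new IllegalStateException("unreachable state " + state);
            }
        }
    }
}
```

In this sketch the next state is easy to read off. A dispatcher obfuscation in the sense described above would compute the next state in a way that is hard to analyse statically.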

Control computation obfuscations are further categorised as smoke and mirrors obfuscations, high-level language breaking obfuscations and alter control flow obfuscations. Smoke and mirrors obfuscations hide the real control flow behind instructions that are irrelevant, for example, by inserting dead code into a program. High-level language breaking obfuscations introduce features at the object code level for which there is no direct source code equivalent. For example, Java does not have a goto statement. Inserting goto instructions at the bytecode level could render decompilers unable to find suitable flow graphs.
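A smoke and mirrors transformation might look roughly like the sketch below, which adds an irrelevant computation and a branch that can never execute. The names DeadCodeDemo, max and junk are hypothetical. The goto-based technique cannot be shown the same way, because it has no Java source equivalent and is applied directly to bytecode.

```java
final class DeadCodeDemo {
    // Original method.
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    // Obfuscated: same result, padded with instructions that have no effect.
    static int maxObfuscated(int a, int b) {
        int junk = (a ^ b) * 31 + 7; // irrelevant computation
        junk -= junk;                // junk is now always 0
        int result = a > b ? a : b;
        if (junk != 0) {             // dead code: this branch never runs
            result = -result;
        }
        return result + junk;        // adding 0 changes nothing
    }
}
```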

Alter control flow obfuscations take a high-level construct and replace it with an equivalent sequence of lower-level instructions, thus removing abstractions from the program. For example, a for-loop in C source code can be transformed into an equivalent loop that uses ‘if’ and ‘goto’ statements. Such transformations can be regarded as patterns intended to make Java decompilers fail. They are implemented in our obfuscator to test the decompilers and for comparison with our approach.

The above article was written by Mr. Mairsh Jones, who provides freelance custom essay writing services.

Published at DZone with permission of its author, Mairsh Jones.


Comments

Claude Lalyre replied on Fri, 2010/11/26 - 1:27pm

Clear, precise and concise ! A really impressive post !
