Comprehensive Reverse Engineering Tool


Abstract

This project aims to develop a tool that reverse engineers existing Java code into meaningful class diagrams. The tool must show the relationships between classes, such as inheritance (is-a) and aggregation (has-a).

A common requirement in industry is to understand an existing system before working on it. Understanding occurs at various levels, viz. business needs, high-level operation, interaction with other systems, etc. One way to gain an overall picture of a system is through UML diagrams. Surprisingly, many modern systems do not have this documentation at all.

This project aims to reverse engineer existing code into usable class diagrams.

Optionally, this service can be exposed as a web service. The following three tasks can be performed with the application:

The Policy Component is responsible for creating, reading, updating, and deleting the data associated with policies.

The Underwriting Component is responsible for the data collection and determination of risk for an insured party.

Reverse engineering

Reverse engineering in this context means that the UML tool reads program source code as input and derives model data and corresponding graphical UML diagrams from it. This is narrower than the general meaning of the term.

Some of the challenges of reverse engineering are:

The source code often has much more detailed information than one would want to see in design diagrams. This problem is addressed by software architecture reconstruction.

Diagram data is normally not stored with the program source, so the UML tool, at least initially, has to either place the graphical symbols of the UML notation arbitrarily or use an automatic layout algorithm so that the user can understand the diagram. For example, the symbols should be placed on the drawing pane so that they do not overlap. Usually, the user still has to edit the automatically generated diagrams by hand to make them meaningful. It also often makes little sense to draw diagrams of the whole program source, as that is too much detail to be of interest at the level of UML diagrams.
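As a minimal sketch of the automatic placement step described above, the following assumes that every class symbol has the same fixed size and simply fills a grid row by row, which is enough to guarantee that no two symbols overlap. The class name GridLayoutSketch, the method place, and the box dimensions are illustrative assumptions, not part of the tool.

```java
import java.awt.Rectangle;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: places equally sized class boxes on a grid.
class GridLayoutSketch {
    static final int BOX_WIDTH = 160;  // assumed symbol width in pixels
    static final int BOX_HEIGHT = 80;  // assumed symbol height in pixels
    static final int GAP = 40;         // assumed spacing between symbols

    // Returns one rectangle per class name, filling each row left to right.
    // Class names are assumed to be unique; a repeated name overwrites its earlier box.
    static Map<String, Rectangle> place(List<String> classNames, int columns) {
        if (columns < 1) {
            throw new IllegalArgumentException("columns must be >= 1");
        }
        Map<String, Rectangle> layout = new LinkedHashMap<>();
        for (int i = 0; i < classNames.size(); i++) {
            int col = i % columns;
            int row = i / columns;
            int x = GAP + col * (BOX_WIDTH + GAP);
            int y = GAP + row * (BOX_HEIGHT + GAP);
            layout.put(classNames.get(i), new Rectangle(x, y, BOX_WIDTH, BOX_HEIGHT));
        }
        return layout;
    }
}
```

Calling GridLayoutSketch.place(List.of("A", "B", "C"), 2) puts A and B in the first row and C in the second. A real tool would more likely use a layout that also takes the relationships between classes into account, which this sketch ignores.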

Some programming language features, such as the class and function templates of C++, are notoriously hard to convert automatically into UML diagrams in their full complexity.

Existing System

The existing system is a reverse engineering tool whose goal is to automatically determine whether an implementation is consistent with the original design. In the system described, XML Metadata Interchange (XMI) representations of Unified Modelling Language (UML) class diagrams are recovered from compiled Java class files. These are automatically compared with the corresponding diagrams produced during forward engineering by software engineers using CASE tools. Examples are provided in which reverse-engineered UML class diagrams differ from those produced during forward engineering but are still faithful to the original design intent.
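To illustrate how class relationships can be recovered from compiled classes, here is a small sketch that uses Java reflection on already loaded classes. It is not the XMI-based approach of the system described above; the class RelationshipScanner, its output format, and the example types Engine, Vehicle and Car are assumptions made for illustration only.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: derives is-a / has-a lines from a loaded class.
class RelationshipScanner {
    static List<String> scan(Class<?> type) {
        List<String> out = new ArrayList<>();
        String name = type.getSimpleName();

        // Inheritance: report the superclass unless it is java.lang.Object.
        Class<?> parent = type.getSuperclass();
        if (parent != null && parent != Object.class) {
            out.add(name + " is-a " + parent.getSimpleName());
        }
        // Interface realization.
        for (Class<?> iface : type.getInterfaces()) {
            out.add(name + " implements " + iface.getSimpleName());
        }
        // Aggregation: fields whose type is neither primitive nor a JDK class.
        for (Field field : type.getDeclaredFields()) {
            Class<?> fieldType = field.getType();
            if (!field.isSynthetic()
                    && !fieldType.isPrimitive()
                    && !fieldType.getName().startsWith("java.")) {
                out.add(name + " has-a " + fieldType.getSimpleName());
            }
        }
        return out;
    }
}

// Tiny example types, used only for the demonstration.
class Engine {}
class Vehicle {}
class Car extends Vehicle {
    Engine engine;
    int wheels;
}
```

For the example types, RelationshipScanner.scan(Car.class) reports that Car is-a Vehicle and has-a Engine; the primitive field wheels is skipped. Collections of user types (for example List<Engine>) are not resolved by this sketch, because the field type is then a JDK class.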

Limitations of Existing System

For a given specification there may be many valid designs, each with corresponding implementations. If an implementation conforms to its requirements but deviates from the documented design, system maintainability may be compromised. Therefore, there is a need for an automated auditing tool to check consistency.
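A very simplified form of such a consistency check can be sketched as a comparison of two sets of relationships, one taken from the documented design and one recovered from the implementation. Representing a relationship as a plain string and the names DesignAudit, missing and undocumented are assumptions for this sketch. A strict comparison like this would also flag implementations that differ from the diagram yet remain faithful to the design intent.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical audit helper comparing designed and recovered relationships.
class DesignAudit {
    // Relationships present in the design but absent from the implementation.
    static Set<String> missing(Set<String> designed, Set<String> recovered) {
        Set<String> result = new TreeSet<>(designed);
        result.removeAll(recovered);
        return result;
    }

    // Relationships found in the implementation that the design does not document.
    static Set<String> undocumented(Set<String> designed, Set<String> recovered) {
        return missing(recovered, designed);
    }

    // The two views agree only if neither side has extra relationships.
    static boolean consistent(Set<String> designed, Set<String> recovered) {
        return missing(designed, recovered).isEmpty()
                && undocumented(designed, recovered).isEmpty();
    }
}
```

The two difference sets give the auditor a concrete list of deviations to review, rather than a single pass or fail answer.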

Proposed System

The test plan is essentially a list of test cases that need to be run on the system. Some test cases can be run independently on individual components, while others require the whole system to be ready. It is better to test each component as soon as it is ready, before integrating the components.

It is important that the test cases cover all aspects of the system (i.e., all the requirements stated in the RS document). You may want to split requirements further for ease of coding.

Problem Definition

This project aims to develop a tool that reverse engineers existing Java code into meaningful class diagrams. The tool must show the relationships between classes, such as inheritance (is-a) and aggregation (has-a).

Advantages of Proposed System

Code inspections improve system quality and maintainability by ensuring that an implementation conforms to corporate standards and accepted industry best practices.

Modules

✓ Class Decompiler

✓ Plugin / Eclipse IDE Integration

✓ Class Diagram Generator

✓ Metrics

Class Decompiler

A decompiler is a program that translates executable programs (the output of a compiler) into source code in a (relatively) high-level language which, when compiled, produces an executable whose behavior is the same as that of the original program. By comparison, a disassembler translates an executable program into assembly language (and an assembler can be used to assemble it back into an executable program).

Decompilation is the act of using a decompiler, although the term, when used as a noun, can also refer to the output of a decompiler. It can be used for the recovery of lost source code, and is also useful in some cases for computer security, interoperability and error correction.

Design

A decompiler can be thought of as a series of phases, each of which contributes a specific aspect of the overall decompilation process.

Loader

The first decompilation phase is the loader, which parses the input machine code or intermediate language program's binary file format.
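For Java class files, the very first step of a loader is to check the file header. The sketch below reads only the first eight bytes, which in the class file format hold the magic number 0xCAFEBABE followed by the minor and major version numbers. The class name ClassHeaderLoader and the method readVersion are assumptions for illustration; a real loader would go on to parse the constant pool, fields, methods and attributes.

```java
import java.nio.ByteBuffer;

// Hypothetical loader fragment: validates the header of a Java class file.
class ClassHeaderLoader {
    static final int MAGIC = 0xCAFEBABE; // first four bytes of every class file

    // Returns {minorVersion, majorVersion}; rejects data that is not a class file.
    static int[] readVersion(byte[] classFile) {
        if (classFile.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buffer = ByteBuffer.wrap(classFile); // big-endian by default
        if (buffer.getInt() != MAGIC) {
            throw new IllegalArgumentException("Missing 0xCAFEBABE magic number");
        }
        int minor = buffer.getShort() & 0xFFFF; // unsigned 16-bit value
        int major = buffer.getShort() & 0xFFFF; // e.g. 52 corresponds to Java 8
        return new int[] { minor, major };
    }
}
```

The bytes can be obtained, for example, with java.nio.file.Files.readAllBytes on a .class file.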

Code generation

The final phase is the generation of high-level code in the back end of the decompiler. Just as a compiler may have several back ends for generating machine code for different architectures, a decompiler may have several back ends for generating code in different high-level languages.
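The idea of interchangeable back ends can be sketched with a common interface and one implementation per output format. Everything here is an assumption for illustration: the intermediate form ClassModel, the interface BackEnd, and the choice of a Java source skeleton and PlantUML class diagram text as the two output formats.

```java
// Hypothetical intermediate form of one recovered class.
class ClassModel {
    final String name;
    final String superName; // null when the class has no explicit superclass

    ClassModel(String name, String superName) {
        this.name = name;
        this.superName = superName;
    }
}

// Each back end turns the same model into a different textual output.
interface BackEnd {
    String generate(ClassModel model);
}

// Emits a Java source skeleton.
class JavaBackEnd implements BackEnd {
    public String generate(ClassModel model) {
        String ext = model.superName == null ? "" : " extends " + model.superName;
        return "class " + model.name + ext + " {\n}\n";
    }
}

// Emits PlantUML class diagram text.
class PlantUmlBackEnd implements BackEnd {
    public String generate(ClassModel model) {
        StringBuilder sb = new StringBuilder("@startuml\n");
        sb.append("class ").append(model.name).append("\n");
        if (model.superName != null) {
            // "Parent <|-- Child" draws an inheritance arrow in PlantUML.
            sb.append(model.superName).append(" <|-- ").append(model.name).append("\n");
        }
        return sb.append("@enduml\n").toString();
    }
}
```

Because both back ends consume the same ClassModel, the earlier phases do not need to know which output format was requested.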

Software Requirements

Operating System: Windows XP/2003 or Linux/Solaris

Programming Language: Java

Graphical User Interface: Swing

Data Structures: Java Collections

IDE/Workbench: Eclipse with MyEclipse Plug-in

Hardware Requirements

Processor: Pentium IV

Hard Disk: 40 GB

RAM: 256 MB
