Thursday, 5 June 2014

Decompiling Python programs

The EditWad program [link] was written in Python 2.5 and packed into an *.exe file using py2exe [link].

Packing a Python program into an *.exe file means the end user doesn’t need to have Python installed on their PC to run the program.

Because Python programs are compiled to bytecode rather than machine code, a Python program should be easier to decompile than one compiled to machine code, such as a FreePascal or C++ program.
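As a small illustration of why bytecode is easier to reverse than machine code, the sketch below compiles a throwaway function and looks at what the interpreter keeps. The function name add_one is made up for the example, and the code assumes Python 2.6 or later (Python 2.5 spells the attribute func_code instead of __code__).

```python
import dis

def add_one(x):
    # Trivial function used only so there is something to compile.
    return x + 1

# The compiled bytecode lives on the function's code object.
code = add_one.__code__
raw_bytecode = code.co_code      # the raw bytecode bytes (a str on Python 2)
names_kept = code.co_varnames    # local variable names survive compilation

# Print a human-readable listing of the bytecode instructions.
dis.dis(add_one)
```

The listing shows high-level instructions and the original variable names, which is the information a decompiler works from.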



An internet search for ‘Decompiling+Py2Exe’ led me to this thread at Stack Overflow. [link]

I downloaded the two programs mentioned in Extreme Coders’ post. I don’t think Python is needed to run either of them.
  • Py2ExeDumper [link]
  • Easy Python Decompiler [link]
I installed Python [link] on my PC so I could:
  1. run the decompiled Python program, and
  2. run the Python script suggested in the Py2ExeDumper “INFO.txt” file to extract the program bytecode.
I installed version 2.7 since it is the final release of the Python 2 series and can still run most code written for version 2.5.

I opened two File Explorer windows, one for the EditWad folder and one for the Py2ExeDumper folder, then dragged “EditWad.exe” onto “Py2ExeDumper.exe”.

Click and drag EditWad onto Py2ExeDumper

A command prompt window should appear briefly with messages, and a file called “PYTHONSCRIPT” should be created in the EditWad folder.

File named PYTHONSCRIPT created

The “INFO.txt” file contains the following Python script for extracting the program from the “PYTHONSCRIPT” file.

import marshal, imp

f=open('PYTHONSCRIPT','rb')
f.seek(17)  # Skip the header, you have to know the header size beforehand

ob=marshal.load(f)

for i in range(0,len(ob)):
    open(str(i)+'.pyc','wb').write(imp.get_magic() + '\0'*4 + marshal.dumps(ob[i]))

f.close()

If you look in EditWad’s folder you will see a file named “library.zip”. This file was created by py2exe.

The Python script in “INFO.txt” assumes that “library.zip” was packed into the *.exe file.

Since “library.zip” is separate in this case, I had to change the Python script as follows, using the information in “INFO.txt”:

import marshal, imp

f=open('PYTHONSCRIPT','rb')
f.seek(28)  # Skip the header: 28 = 17 + 11 bytes for the characters of 'library.zip'

ob=marshal.load(f)

for i in range(0,len(ob)):
    open(str(i)+'.pyc','wb').write(imp.get_magic() + '\0'*4 + marshal.dumps(ob[i]))

f.close()
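The two header sizes (17 and 28) suggest a fixed block of 16 bytes followed by the archive name and a terminating zero byte. That layout is my assumption from the numbers above, not something stated in “INFO.txt”. Under that assumption, the offset to pass to f.seek() could be worked out with a sketch like this (header_size and fixed_bytes are names invented for the example):

```python
def header_size(data, fixed_bytes=16):
    """Return the offset of the first byte after the header.

    Assumes (not confirmed by INFO.txt) that the header is `fixed_bytes`
    of fixed fields followed by a NUL-terminated archive name.
    """
    # Find the NUL byte that ends the archive name.
    end_of_name = data.index(b"\x00", fixed_bytes)
    return end_of_name + 1
```

With an empty archive name this gives 17, and with “library.zip” it gives 28, matching the two versions of the script. If the real layout differs, the result will be wrong, so check it against the values in “INFO.txt”.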

I created a text document in EditWad’s folder, pasted the script with the change and saved the text document with a *.py extension.

Since I associated *.py files with Python when I installed it, all I have to do to run the Python script is double click on it.

Run the Python program listed in INFO.txt

A command prompt window briefly appears and two files named “0.pyc” and “1.pyc” are created in EditWad’s folder.

*.pyc files are Python bytecode files.
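The extraction script writes each file as a magic number, four zero bytes and the marshalled code, which is the layout of a Python 2 *.pyc file. A minimal sketch for checking such a header is shown below. The function name describe_pyc_header and the KNOWN_MAGIC table are made up for the example, and the table only lists the two Python versions mentioned in this post.

```python
import struct

# Magic numbers of two CPython releases; other versions use other values.
KNOWN_MAGIC = {62131: "Python 2.5", 62211: "Python 2.7"}

def describe_pyc_header(header):
    """Decode the 8-byte header of a Python 2 *.pyc file."""
    magic, = struct.unpack("<H", header[0:2])      # version-specific number
    marker_ok = header[2:4] == b"\r\n"             # fixed two-byte marker
    timestamp, = struct.unpack("<I", header[4:8])  # source mtime, 0 if blanked
    return KNOWN_MAGIC.get(magic, "unknown"), marker_ok, timestamp
```

To try it, read the first 8 bytes of “1.pyc” in binary mode and pass them in. Note that the script uses imp.get_magic(), so the header reports the Python version that ran the script, not necessarily the version that compiled the bytecode.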

We have extracted the Python bytecode for EditWad from the *.exe and now we need to decompile the bytecode into Python source code.

Run Easy Python Decompiler, click “Decompile a File” and open “1.pyc”.

I guessed that “1.pyc” was the EditWad program since it is 104 kB.

Use EasyPythonDecompiler to decompile the PYC file

Easy Python Decompiler creates a file named “1.pyc_dis” in the EditWad folder.

If decompiling fails you may see files with the suffix “_fail”.

Source code file created

“1.pyc_dis” is the decompiled Python source code, so rename the file with a *.py extension to run it.

I renamed the file as “editwad.py” and opened and executed it in the Scite Editor packaged with PortablePython 1.0 [link]. This version of Portable Python uses Python version 2.5.

Execute the Python source code

The program had some syntax errors, but fortunately all were caused by incorrect syntax for exception catching.

All I did to fix the errors was delete the word “as” for the first error (shown in the screenshot above) and replace the word “as” with a comma “,” for all the others.
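I made the replacements by hand, but the same fix could be sketched as a script. The example below assumes the faulty lines have the form “except SomeError as name:” and rewrites them to the comma form that Python 2.5 accepts. The names EXCEPT_AS and to_python25_except are invented for the example, and it does not handle the first error, where I simply deleted the word “as”.

```python
import re

# Matches "except SomeError as name:" and captures the two parts.
EXCEPT_AS = re.compile(r"^(\s*except\s+[^:]+?)\s+as\s+(\w+)\s*:", re.MULTILINE)

def to_python25_except(source):
    """Rewrite 'except X as e:' into the Python 2.5 form 'except X, e:'."""
    return EXCEPT_AS.sub(r"\1, \2:", source)
```

This is a plain text substitution, so it would also change matching text inside a string literal. Check the result by eye before running it.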

EditWad running

Before EditWad can import Metasequoia *.mqo files the MQOParser module has to be copied to the same folder as “editwad.py”.

If you unzip “library.zip” you will find the file “MQOParser.pyc”. This is a Python program to extract data from an *.mqo file.

Copy “MQOParser.pyc” to the EditWad folder.
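If you prefer to do the unzip and copy from Python, a minimal sketch using the standard zipfile module follows. It assumes “MQOParser.pyc” sits at the top level of “library.zip”, and extract_member is a name made up for the example. It needs Python 2.7 or later because of the with statement on ZipFile.

```python
import zipfile

def extract_member(archive_path, member_name, dest_dir):
    """Extract one file from a zip archive and return the path written."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.extract(member_name, dest_dir)

# Example call with the file names used in this post:
# extract_member("library.zip", "MQOParser.pyc", ".")
```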

You do not need to decompile “MQOParser.pyc” to run “editwad.py” but decompile it with Easy Python Decompiler if you want to study the source code.

Now you can learn some Python and add features to the program or fix my poor code ;-)!

py2exe was also used to pack my TRW Editor and MeshTree Editor Python 2.5 programs into *.exe files. In both cases “library.zip” was not packed into the *.exe.

TRW Editor and MeshTree Editor used the wxPython package, so to run them from source code you will need to add the correct version of wxPython to your Python installation.

I would like to point out that the progressbarClass class in EditWad was not written by me. The comments attributing the author must have been stripped, but here they are from the source code of version 1.3.

class progressbarClass: 
    # Author: Larry Bates (lbates@syscononline.com)
    #
    # Written: 12/09/2002
    #
    # Released under: GNU GENERAL PUBLIC LICENSE
    #
    #
    def __init__(self, finalcount, progresschar=None):
        import sys
        self.finalcount=finalcount
        self.blockcount=0
        #
        # See if caller passed me a character to use on the
        # progress bar (like "*").  If not use the block
        # character that makes it look like a real progress
        # bar.
        #
        if not progresschar: self.block=chr(178)
        else:                self.block=progresschar
        #
        # Get pointer to sys.stdout so I can use the write/flush
        # methods to display the progress bar.
        #
        self.f=sys.stdout
        #
        # If the final count is zero, don't start the progress gauge
        #
        if not self.finalcount : return
        self.f.write('\n------------------ % Progress -------------------1\n')
        self.f.write('    1    2    3    4    5    6    7    8    9    0\n')
        self.f.write('----0----0----0----0----0----0----0----0----0----0\n')
        return

    def progress(self, count):
        #
        # Make sure I don't try to go off the end (e.g. >100%)
        #
        count=min(count, self.finalcount)
        #
        # If finalcount is zero, I'm done
        #
        if self.finalcount:
            percentcomplete=int(round(100*count/self.finalcount))
            if percentcomplete < 1: percentcomplete=1
        else:
            percentcomplete=100
            
        #print "percentcomplete=",percentcomplete
        blockcount=int(percentcomplete/2)
        #print "blockcount=",blockcount
        if blockcount > self.blockcount:
            for i in range(self.blockcount,blockcount):
                self.f.write(self.block)
                self.f.flush()
                
        if percentcomplete == 100: self.f.write("\n")
        self.blockcount=blockcount
        return

1 comment:

  1. There is also a perfect open-source Python (.PYC) decompiler, called Decompyle++ https://github.com/zrax/pycdc/

    Decompyle++ aims to translate compiled Python byte-code back into valid and human-readable Python source code. While other projects have achieved this with varied success, Decompyle++ is unique in that it seeks to support byte-code from any version of Python.

    Online version is also available: http://www.javadecompilers.com/pyc
