Static extractor and decompiler for PyInstaller-packed executables, written in Rust. Designed for malware reverse engineering — no Python runtime required.
$ unpy malware.exe -o out/
[*] read 41711939 bytes from malware.exe
[*] cookie at offset 0x27c78eb
[*] archive start at offset 0xaaa00
[*] Python version: 312 (python312.dll)
[*] 1281 TOC entries
[*] extracted 2591 files
[*] report written to out/report.json
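The `cookie at offset` and `archive start` lines refer to the CArchive cookie that PyInstaller appends to the executable. As a rough illustration (Python, not unpy's Rust code), the sketch below finds the cookie by its 8-byte magic and unpacks it. The 88-byte layout is the one used by PyInstaller 2.1 and later as pyinstxtractor reads it, and the sketch assumes the package-length field includes the cookie itself. `parse_cookie` and the dictionary keys are hypothetical names.

```python
import struct

# PyInstaller CArchive cookie magic: b"MEI\014\013\012\013\016"
COOKIE_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"
# Assumed layout (PyInstaller >= 2.1): magic, package length, TOC offset,
# TOC length, Python version, Python library name. Big-endian, no padding.
COOKIE_FORMAT = "!8sIIii64s"
COOKIE_SIZE = struct.calcsize(COOKIE_FORMAT)


def parse_cookie(data: bytes) -> dict:
    """Locate the last cookie in `data` and decode its fields."""
    offset = data.rfind(COOKIE_MAGIC)
    if offset < 0 or offset + COOKIE_SIZE > len(data):
        raise ValueError("no PyInstaller cookie found")
    _magic, pkg_len, toc_off, toc_len, pyver, libname = struct.unpack_from(
        COOKIE_FORMAT, data, offset
    )
    return {
        "cookie_offset": offset,
        # Assumption: pkg_len counts the archive body plus the cookie.
        "archive_start": offset + COOKIE_SIZE - pkg_len,
        "toc_offset": toc_off,      # relative to archive_start
        "toc_length": toc_len,
        "python_version": pyver,    # e.g. 312 for Python 3.12
        "python_lib": libname.rstrip(b"\x00").decode("ascii", "replace"),
    }
```

Searching from the end with `rfind` avoids false hits on the same byte sequence inside the PE stub or embedded data.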
unpy combines two separate steps that analysts usually run manually:
- PyInstaller extraction (what pyinstxtractor does): PE → raw .pyc files
- Bytecode decompilation (what pycdc does): .pyc → .py source
The real value is not magic — it's the structured output: modules are automatically
classified into stdlib / libs / project, a JSON report is generated, and the
decompilation chain (pycdc → pycdas fallback) runs without manual intervention.
If you already have a working pyinstxtractor + pycdc setup, unpy is not a replacement. It's a convenience wrapper with analyst-oriented output.
- Rust + Cargo — to build unpy
- pycdc — bytecode decompiler, must be in PATH
pycdc is not always available via a package manager and is not actively maintained. Build it from source if needed: github.com/zrax/pycdc
git clone https://github.com/pinonym/unpy
cd unpy
cargo build --release
# binary at target/release/unpy
unpy <input.exe> [OPTIONS]
Options:
-o, --output <DIR> Output directory [default: <input>_extracted]
--no-libs Skip stdlib and third-party modules
--pycdc <PATH> Path to pycdc binary [default: pycdc]
--uncompyle6 <PATH> Path to uncompyle6 binary (used for Python <= 3.8)
--python-version <VER> Force Python version (e.g. 312) instead of autodetect
-v, --verbose Show per-file decompilation status
out/
├── src/
│ ├── project/ ← attacker's custom modules — start here
│ │ ├── mainscript.py
│ │ └── encryptor.py
│ ├── libs/ ← known third-party (discord, requests, …) — verify these
│ └── stdlib/ ← Python built-ins — usually safe to ignore
├── pyc/ ← raw .pyc files, mirroring src/ structure
│ ├── project/
│ ├── libs/
│ └── stdlib/
├── binaries/ ← native .dll / .pyd / .so
└── report.json
Use --no-libs to extract only project/ and skip stdlib/third-party noise.
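The stdlib / libs / project split can be pictured as a lookup on the top-level package name. The sketch below is a Python illustration only: `KNOWN_THIRD_PARTY` is a made-up four-entry list, `classify` is a hypothetical name, and unpy's real list and matching rules are not documented here.

```python
import sys

# Assumption: a tiny illustrative list; unpy ships its own static list.
KNOWN_THIRD_PARTY = {"requests", "discord", "numpy", "cryptography"}

# sys.stdlib_module_names exists on Python 3.10+; use a small fallback otherwise.
STDLIB = set(getattr(sys, "stdlib_module_names", {"os", "sys", "json", "re"}))


def classify(module_name: str) -> str:
    """Map a dotted module name to 'stdlib', 'libs' or 'project'."""
    top_level = module_name.split(".", 1)[0]
    if top_level in STDLIB:
        return "stdlib"
    if top_level in KNOWN_THIRD_PARTY:
        return "libs"
    # Anything unrecognised is treated as the author's own code.
    return "project"
```

With this kind of rule, any name missing from the lists falls into project/, which is why a misspelled library name such as `requestss` ends up there.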
For each .pyc:
- pycdc: attempts full Python source reconstruction
- uncompyle6 (Python ≤ 3.8 only, if --uncompyle6 is provided): better support for older bytecode formats
- pycdas (fallback): triggered if all of the above fail; gives readable bytecode disassembly
Each module in report.json includes which decompiler was used and the final status.
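The chain above could be approximated by hand with a short script. This is a sketch in Python, not unpy's code: it assumes each tool prints its result to stdout when given a .pyc path, treats a non-zero exit code or empty output as failure, reports a uncompyle6 success as `ok`, and leaves out the `partial` status. `run_tool` and `decompile` are hypothetical names.

```python
import subprocess
from typing import Callable, List, Optional, Tuple


def run_tool(argv: List[str]) -> Optional[str]:
    """Run one tool; return its stdout, or None if it failed or printed nothing."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0 or not proc.stdout.strip():
        return None
    return proc.stdout


def decompile(
    pyc_path: str,
    python_version: int,                  # e.g. 36 or 312, as in report.json
    uncompyle6: Optional[str] = None,     # path given via --uncompyle6
    run: Callable[[List[str]], Optional[str]] = run_tool,
) -> Tuple[Optional[str], str]:
    """Try each decompiler in order; return (decompiler, status)."""
    chain = [("pycdc", ["pycdc", pyc_path], "ok")]
    if uncompyle6 and python_version <= 38:
        chain.append(("uncompyle6", [uncompyle6, pyc_path], "ok"))
    chain.append(("pycdas", ["pycdas", pyc_path], "disasm"))
    for name, argv, status in chain:
        if run(argv) is not None:
            return name, status
    return None, "failed"
```

Passing the runner as a parameter keeps the ordering logic testable without the external binaries installed.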
{
"python_version": 312,
"total_modules": 1327,
"decompilation": {
"ok": 232,
"partial": 0,
"disasm": 1095,
"failed": 0
},
"suspicious_count": 15,
"modules": [
{
"name": "mainscript",
"status": "ok",
"decompiler": "pycdc",
"category": "project",
"suspicious": false,
"path": "out/src/project/mainscript.py"
},
{
"name": "requestss",
"status": "disasm",
"decompiler": "pycdas",
"category": "project",
"suspicious": true,
"path": "out/src/project/requestss.py"
}
]
}
Status values:
- ok: clean pycdc decompile, readable Python source
- disasm: pycdas fallback, bytecode disassembly (pycdc failed or produced no code)
- partial: pycdc produced warnings, pycdas also unavailable
- failed: both decompilers failed
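Because the report is plain JSON, it is easy to post-process. The sketch below (Python, using only the fields shown in the example above) counts modules per status and lists project modules with suspicious ones first. `load_report` and `summarize` are hypothetical helper names, not part of unpy.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Read report.json from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize(report: dict) -> dict:
    """Count modules per status and order project modules for review."""
    modules = report.get("modules", [])
    by_status = Counter(m["status"] for m in modules)
    project = [m for m in modules if m["category"] == "project"]
    # Suspicious modules sort first, then alphabetically by name.
    project.sort(key=lambda m: (not m["suspicious"], m["name"]))
    return {
        "by_status": dict(by_status),
        "review_order": [m["name"] for m in project],
    }
```

A typical call would be `summarize(load_report("out/report.json"))`.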
suspicious flag: set when a module name in project/ is within Levenshtein distance 2 of a known stdlib or third-party package name — potential typosquatting or name confusion attack.
⚠ Third-party packages in libs/ can also be backdoored. The classification is a best-effort heuristic, not a guarantee. Always verify libs/ modules that appear in sensitive code paths.
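The suspicious check described above comes down to an edit-distance comparison. The following Python sketch shows one way to express it. It assumes an exact match (distance 0) is not flagged, and `KNOWN_NAMES` is a made-up sample; unpy's actual name list and edge-case handling may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance using two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


# Assumption: illustrative names only.
KNOWN_NAMES = {"requests", "discord", "socket", "subprocess"}


def is_suspicious(module_name: str, known=KNOWN_NAMES, max_distance: int = 2) -> bool:
    """True if the name is close to, but not equal to, a known package name."""
    return any(0 < levenshtein(module_name, k) <= max_distance for k in known)
```

For the example report, `levenshtein("requestss", "requests")` is 1, so `requestss` is flagged. Very short known names would match almost anything at distance 2, so a real list probably needs a minimum-length rule.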
- Python: 3.0 – 3.13
- PyInstaller: 3.x – 6.x
- Binary format: PE (Windows .exe); ELF/Mach-O not yet supported
- Encryption: PyInstaller 4.x --key AES encryption is detected but not decrypted; affected modules will fall back to pycdas disassembly
- ELF (Linux) and Mach-O (macOS) binaries not yet supported
- PyInstaller 4.x AES-encrypted bundles: modules are decrypted only at runtime, so unpy cannot recover plaintext
- pycdc support for Python 3.12+ opcodes is still incomplete — expect heavy pycdas fallback on recent samples
- The stdlib/third-party classification is a static heuristic list, not exhaustive
Extraction and .pyc output work correctly, but all decompilation steps fail on
Python 3.6–3.7 bytecode due to two separate upstream bugs:
- pycdc / pycdas: the marshal parser does not handle type bytes tagged with FLAG_REF (for example 0xe3, which is TYPE_CODE 0x63 with the 0x80 bit set). The 0x80 bit is not stripped before type dispatch, causing std::bad_cast on virtually all real-world 3.6 binaries.
- uncompyle6 / decompyle3: the xdis 6.1.x marshal parser crashes on FLAG_REF code objects.
FLAG_REF is a CPython marshal optimisation that marks objects interned in a reference
table during deserialisation. It became pervasive in PyInstaller-packed 3.6 binaries.
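To make the 0xe3 value concrete: in CPython's marshal format, FLAG_REF is the high bit (0x80) of the type byte, and a code object has type `'c'` (0x63). A parser has to mask the flag off before dispatching on the type. The sketch below shows only that masking step in Python; `split_type_byte` is a hypothetical helper, not code from pycdc or xdis.

```python
FLAG_REF = 0x80        # marshal.c: object is also stored in the reference table
TYPE_CODE = ord("c")   # 0x63, code object
TYPE_REF = ord("r")    # back-reference into the reference table


def split_type_byte(raw: int) -> tuple:
    """Return (type_code, has_ref_flag) for one marshal type byte."""
    type_code = raw & ~FLAG_REF & 0xFF   # clear the high bit
    has_ref = bool(raw & FLAG_REF)
    return type_code, has_ref
```

Dispatching on the raw byte 0xe3 instead of the masked value 0x63 is the failure mode described above: the parser no longer recognises the object as a code object.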
unpy will still extract all .pyc files and emit a warning in report.json.
Workaround: run strings on the binary for quick triage, or use python3.6 -m dis if
a Python 3.6 interpreter is available.
No dynamic analysis, no VirusTotal submission, no scoring. The tool extracts and decompiles. The analyst reads.