DLL Decomp Notes
The Windows build of Pikmin 1 was compiled using MSVC 6.0. (TODO: Intns always said this, but I don't remember the source to this claim)
TODO: Explain how we scraped symbols from the ILKs. Explain how other symbols are known due to DLL exports.
TODO: Explain the purpose of each executable binary (sysBootup.exe, sysCore.dll, plugPiki.dll, maybe the others too?)
TODO: I think at some point we were able to determine the exact build flags from some build artifact left on the disc. Is this true? Can we document that?
TODO: We should probably explain how to access the shared Ghidra repository on here.
TODO: Meeo! Yes, the one writing this TODO! Write about what the stack tells us about named variables! local_XX vs iVarX! Unnamed temporaries that end up on the stack! Constructor patterns! Definitive proof of ternaries? this pointer backed up only at first spot it needs to. Primitive types taken by reference (e.g. f32&) look like named locals, but are actually unnamed temporaries.
TODO: Statements, unlike parameters, are evaluated left to right? Or is that just basic operator precedence rules?
TODO: Can it be proven or at least heavily suggested when += ... is used vs = ... + ...?
TODO: NsCalculation::calcMatrix3f appears separately from NsCalculation::calcMatrix and NsCalculation::calcJointPos in the ILK, because it truly has zero XRefs in the DLL. Can we explain that? NsCalculation::calcMtxDirect, another unused function, appears next.
Navigate to Edit > Tool Options > Options > Decompiler > Analysis. Uncheck "Eliminate unreachable code". Fuck you, Ghidra.
In all seriousness, MSVC debug builds do not strip any dead code. while (true) compiles as MOV EAX, 1; TEST EAX, EAX; and JZ. Crucial to this point: if (false) compiles as XOR EAX, EAX; TEST EAX, EAX; and JNZ. A return with inaccessible code after it compiles as a JMP instruction to the stack frame teardown. If you don't disable this "feature", Ghidra only warns you (at the top of the pseudocode window) that it removed inaccessible code blocks at a given address.
While you are here, also check "Use inplace assignment operators" to immediately improve your user-experience with Ghidra. It allows Ghidra to recognize increment (++) and decrement (--) operators, plus +=, -=, *=, /=, etc.
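To make the dead-code point concrete, here is a minimal sketch of source that a debug build would keep in full. The function is hypothetical and not taken from the game; it only shows the three shapes mentioned above (a constant-false branch, a constant-true loop, and statements after a return).

```cpp
// Hypothetical example, not from the game.
int deadCodeExample(int x)
{
    if (false) {
        // Never executed, but a debug build still emits this block behind
        // a constant-condition test.
        x += 100;
    }

    while (true) {
        // The loop condition is still tested at run time in a debug build.
        if (x > 10) {
            break;
        }
        x++;
    }

    return x;

    // Unreachable, but still emitted after the jump to the frame teardown.
    x = -1;
}
```

With "Eliminate unreachable code" left enabled, Ghidra would drop the if (false) body and the trailing assignment from the pseudocode, which makes it harder to recover what the source actually looked like.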
A local function (i.e. static, e.g. _Print and _Error) is emitted by MSVC after the first function to use it. If no function uses it, I think it will never be emitted.
Weak functions (i.e. inline) seem to appear in the order they were declared; I assume after global functions are emitted (in what TU?). This order also usually resembles the order of symbols scraped from the ILK (but not always?).
Global functions (i.e. non-static) appear in the same order as in the GCN build with no interruptions (save for emitted static functions / that thing where MetroWerks reverses the order of function definitions in a TU, I forget the setting). This is very useful for locating UNUSED functions identified by the MetroWerks linker map.
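As a rough illustration of the ordering rules above, consider this hypothetical translation unit. The names are made up, and the comments only restate what these notes claim about emission order; they are not independently verified.

```cpp
// Hypothetical translation unit; none of these names come from the game.

// File-local helper, in the spirit of _Print / _Error.
static int localHelper(int x) { return x * 2; }

// Global function that never touches the helper.
int firstGlobal() { return 1; }

// First user of localHelper. Per the notes above, the helper's code would be
// expected right after this function in the DLL.
int secondGlobal() { return localHelper(1); }

// Later user; the helper is not emitted a second time.
int thirdGlobal() { return localHelper(2); }
```

Under those assumptions, the expected layout in the DLL would be firstGlobal, secondGlobal, localHelper, thirdGlobal, even though localHelper is defined first in the source.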
If you are decompiling a function that only survived in the DLL (but has a known size via the MetroWerks linker map) and what you have decompiled matches the UNUSED size stated in the plate comment (and you are reasonably certain what you have written is correct), append "(Matching by size)" to the plate comment above the function to make it known that the function is as correct as we (currently) can guarantee. For example:
/**
* @todo Documentation
* @note UNUSED Size: 00004C (Matching by size)
*/
String literals ARE YOUR FRIEND! If you need to locate a function that hasn't been identified yet, look for string literals in that function (or adjacent functions using what you just learned above) and search program memory for them (Search > Memory, or keybind 's'). Note that Ghidra doesn't support anything beyond ASCII in the memory search for strings for some reason, so if you're searching for Shift-JIS strings it's super inconvenient. Sometimes encoding the string to bytes and searching program memory for those bytes can work. Sometimes...
Vtables are different in MSVC's ABI. In MetroWerks, all virtual member function pointers start at an 8 byte offset into the vtable (offset 0 points to the RTTI, offset 4 is always a word-sized zero unless something, idk). In MSVC, the virtual member function pointers start at offset 0. So basically, always subtract 8, right? Well, not when there's DLL-exclusive member functions (e.g. ATX Server stuff). Sorry! Here's a pro-tip, however. In the constructor for a class, you can pretty easily identify the vtable in memory when it is being written to the class. I like to copy the address table from Ghidra's disassembly listing into a text document so I can manually count out the offsets of each member function. Vtables are also super great for identifying unlabeled virtual methods! Ghidra's C++ support sure is great, right? /s
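The "subtract 8" rule can be written down as a tiny helper. This is only a sketch: mwToMsvcVtableOffset is a hypothetical name, and the second parameter assumes each DLL-exclusive virtual function inserted before the slot in question shifts it by one 4-byte pointer, which you should confirm by counting the address table by hand.

```cpp
#include <cstddef>

// Hypothetical helper: convert a byte offset into a Metrowerks vtable to the
// matching byte offset into the MSVC vtable.
// dllOnlySlotsBefore is an assumption: the number of DLL-exclusive virtual
// functions that sit before this slot, each taken to occupy 4 bytes.
constexpr std::ptrdiff_t mwToMsvcVtableOffset(std::ptrdiff_t mwOffset,
                                              std::ptrdiff_t dllOnlySlotsBefore = 0)
{
    // Metrowerks reserves 8 bytes ahead of the first function pointer;
    // MSVC starts its function pointers at offset 0.
    return mwOffset - 8 + dllOnlySlotsBefore * 4;
}
```

For example, a Metrowerks offset of 0x10 maps to 0x8 in the DLL when no DLL-exclusive functions precede it, and to 0xC when one does.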
Making proper classes and member functions is actually possible (though not well advertised) within Ghidra. For example, consider ayuID (which is already a proper class in the shared Ghidra repo). Here is the step-by-step way to make the member functions of a class such as ayuID act properly in Ghidra:
- If one does not already exist, create an ayuID struct in the "Data Type Manager" (bottom left of the default layout) by right clicking on the "plugPiki.dll" data type library(?) and choosing New > Structure. You can check whether one already exists with the "Filter" field.
- In the Structure Editor window that pops up, fill the "Name" field with the name of your class (e.g. ayuID) and optionally fill out any members of the class you are aware of (you can always change this later by choosing to edit the structure from the "Data Type Manager").
- In any member functions of ayuID that you have found in the program's memory, right click on the function's name and choose Rename Function (hotkey 'L').
- If the function is already in a namespace but it is the wrong one, set the "Namespace" combo-box to Global. In the "Enter Name" field, type in the fully namespace-qualified name of the function (e.g. ayuID::Set) and click "OK". You should see that the scope name is a cyan color while the member function name is a royal blue.
- You only have to do this step once. Once any functions exist in the namespace, find the namespace by name in the "Symbol Tree" widget (top left of the default layout) (use the "Filter" field to locate things by name). If you see a green circle with a 'C' next to your namespace name, it is already a class. If you see two black curly braces, it is still a namespace. If it is a namespace, right click on it and choose "Convert to Class". The functions in this namespace (now class) with the __thiscall calling convention can now implicitly add (to their function parameters) a this pointer to the structure with the same name as the namespace you just converted to a class.
- This part sucks. Ghidra isn't very smart about updating calling conventions now that you've told it about a new class. In any member function's "Decompilation" window, look for the __thiscall specifier between the return type and the function name (at the start of the function). If it is not already there, right click the function name and choose "Edit Function Signature". Change the "Calling Convention" combo-box to __thiscall and fix the incorrect parameters by hand (referencing decomp and/or the ILK symbol dump). Sometimes, Ghidra has the "Use Custom Storage" checkbox checked because it is being extra stupid. Uncheck that if you can, because it's pretty much always wrong. I think this happens when Ghidra hasn't yet finished updating some internal data when you go to edit the function signature, or maybe it happens when you make a function Ghidra thought accepted no parameters into a __thiscall (I'm not 100% sure). If Ghidra fights you over it because of "register allocation" or some shit I don't fully get, the fastest way to fix this is by unchecking it, rechecking it, clicking "OK" (this closes the window), reopening the "Edit Function Signature" window, and then finally unchecking the "Use Custom Storage" checkbox one more time. If the __thiscall specifier is already there but there is not an implicit this pointer to the struct-associated class/namespace, updating anything in the "Edit Function Signature" window should get Ghidra to wake up and update its shit. Just rename a parameter or something. Fuck, I love/hate this decompiler.
A lot of functions in the DLL already are named things like Snake::drawShape, but the entire thing is the function name (including the ::). This is a hold-over from other contributors importing documentation from IDA Pro, which handles things differently. One thing I failed to mention is how the "Edit Function Signature" menu will actually allow you to make symbols in exactly this same way because... reasons. You must use the "Rename Function" menu for Ghidra to put the function in a namespace. Anyway, you can put these IDA-imported symbols into a namespace proper by renaming the function and pressing "OK" with no changes. A strange side-effect of this: if the function is already a __thiscall, Ghidra just... half-asses how it updates the function parameters? You'll see something like Boss *param_1_00, instead of Snake* this. To fix this, enter the "Edit Function Signature" window and change something, be it a parameter name, a parameter type, a return type, etc. This will get Ghidra's ass in gear and correctly update the function signature. If you see void* this, that means the namespace has not yet been converted to a class (follow the instructions above).
The comment /* WARNING: Load size is inaccurate */ in the decompilation window is also usually a clue that you forgot to convert a namespace into a class.
In Ghidra's "Disassembly Listing" window, you will frequently see plate messages like the following:
**************************************************************
* Stack 0xCC poison stubbed (__thiscall) *
* 8d 7d bc b9 11 00 00 00 b8 cc cc cc cc f3 ab *
**************************************************************
**************************************************************
* Stack 0xCC poison stubbed (__stdcall) *
* 8d bd bc fb ff ff b9 11 01 00 00 b8 cc cc cc cc f3 ab *
**************************************************************
**************************************************************
* `_chkesp` call-site stubbed *
* 3b ec e8 f7 2a 24 00 *
**************************************************************
These are the output of two GhidraScripts I (Minty Meeo) wrote to make Ghidra suck less at decompiling MSVC debug binaries: ChkespKiller.java and StackPoisonKiller.java. In short, these were easily-identifiable runtime debugging routines that absolutely choked up Ghidra's decompilation engine, and it only really became possible to use Ghidra for decompilation once they were stubbed with NOP instructions. One really neat thing about the StackPoisonKiller script was that it made it possible to definitively(?) identify __thiscall functions and functions that were not (labeled __stdcall in what I hope was not ignorance). The hex dump of bytes on the second line of the plate comment is the original binary instructions that were overwritten, should it ever become necessary to restore them (so far it hasn't been).
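The first two plate comments record the same instruction sequence: LEA EDI, [EBP + disp]; MOV ECX, count; MOV EAX, 0xCCCCCCCC; REP STOSD. As a rough sketch of how such a sequence could be recognized in raw bytes, here is a small matcher. It is not the actual StackPoisonKiller.java logic, and matchStackPoison is a hypothetical name chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical matcher: returns the length in bytes of a 0xCC stack-poison
// sequence starting at code[pos], or 0 if the bytes do not match.
std::size_t matchStackPoison(const std::vector<std::uint8_t>& code, std::size_t pos)
{
    std::size_t i = pos;
    auto has = [&](std::size_t n) { return i + n <= code.size(); };

    // LEA EDI, [EBP + disp8]  (8D 7D xx) or
    // LEA EDI, [EBP + disp32] (8D BD xx xx xx xx)
    if (!has(2) || code[i] != 0x8D) return 0;
    if (code[i + 1] == 0x7D)      i += 3;
    else if (code[i + 1] == 0xBD) i += 6;
    else return 0;

    // MOV ECX, imm32 (number of dwords to fill)
    if (!has(5) || code[i] != 0xB9) return 0;
    i += 5;

    // MOV EAX, 0xCCCCCCCC
    static const std::uint8_t fill[] = {0xB8, 0xCC, 0xCC, 0xCC, 0xCC};
    if (!has(5)) return 0;
    for (std::size_t k = 0; k < 5; ++k) {
        if (code[i + k] != fill[k]) return 0;
    }
    i += 5;

    // REP STOSD
    if (!has(2) || code[i] != 0xF3 || code[i + 1] != 0xAB) return 0;
    i += 2;

    return i - pos;
}
```

Fed the byte dumps from the plate comments above, this reports 15 bytes for the first sequence (disp8 form) and 18 bytes for the second (disp32 form), which is how many bytes would need to be replaced with NOPs.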
The DLL is well-regarded for how nothing ever inlines, but operator new is an exception to that rule. Ghidra's decompiled pseudocode will usually show operator new's appearance in code like the following:
alloc_result = System::alloc(CLASS_SIZE);
new_counter = 3;
if (alloc_result == 0) {
new_result = 0;
}
else {
new_result = SomeConstructorFunction(alloc_result, ...);
}
new_counter = 0xffffffff;
new_counter is an MSVC-generated variable seemingly for debug builds to count how many times operator new has been used in that function's scope. new_counter is initialized to -1 extremely early in the function, before the ExceptionList stuff and even before base class and member initialization in constructors. The first time operator new is used, new_counter = 0; the second time, new_counter = 1; and so on. new_counter is reassigned the value -1 after each operator new call is done.
In sysCore.dll, it would appear operator new is not inlined at all, for instance in AyuCache::AyuCache. This reveals that operator new has a custom calling convention which accepts the std::size_t count parameter via the EAX register.
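For orientation, the whole allocate / null-check / construct pattern in the pseudocode above normally corresponds to a single new-expression in the source. A minimal sketch, using a hypothetical Widget class that is not from the game:

```cpp
#include <new>

// Hypothetical class for illustration only.
struct Widget {
    int mValue;
    explicit Widget(int value) : mValue(value) {}
};

Widget* makeWidget(int value)
{
    // One new-expression: the compiler generates the allocation call, the
    // check on its result, and the constructor call seen in the pseudocode.
    return new Widget(value);
}
```

So when you see the pattern in Ghidra, the line to write in the decomp is most likely just `new SomeClass(...)`, with the surrounding bookkeeping left to the compiler.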
In MWCC, vtables are added to a class exactly as soon as virtual methods are mentioned. In MSVC, vtables are put as close to a zero-byte offset as is possible. For example:
struct MyStruct
{
char a[4];
virtual void MyMethod();
};
In MWCC, the vtable of MyStruct will be at offset 4 (following MyStruct::a). In MSVC, the vtable of MyStruct will be at offset 0 (because nothing prevented it from doing so). Consider this more advanced example:
struct PodType
{
char b[4];
};
struct MultipleInheritance : PodType, MyStruct
{
};
In MWCC, the vtable will be at offset 8 of MultipleInheritance (following PodType base class and MyStruct::a). In MSVC, I assume the vtable will be at offset 4 of MultipleInheritance (following PodType base class and at the start of base class MyStruct), but I don't think I've actually seen proof of this yet.
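One way to check a layout claim like this on whatever compiler you have at hand is to measure where MyStruct::a ends up at run time. The sketch below repeats the MyStruct definition from above (with an empty body for MyMethod so it links); offsetOfA is a hypothetical helper name.

```cpp
#include <cstddef>

struct MyStruct
{
    char a[4];
    virtual void MyMethod() {}
};

// Hypothetical helper: byte offset of MyStruct::a inside a MyStruct object.
std::size_t offsetOfA()
{
    MyStruct obj;
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(&obj.a[0]) - reinterpret_cast<char*>(&obj));
}
```

If the vtable pointer sits at offset 0, as described for MSVC, a is pushed back by the size of one pointer (4 on the 32-bit target). If the pointer follows a, as described for MWCC, this returns 0.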
MSVC passes parameters to functions from right to left. So for example (this is from PcamCamera::printInfo, ignore the fact that the printf is technically wrong):
gfx.texturePrintf(font, x, y, "%2d,%3d,%4.0f,%4.0f,%4.0f,%3.2f,%3.2f,%3.2f",
90 - int(NMathF::r2d(mPolarDir.mInclination)), int(getFov()),
mCurrDistance, mPolarDir.mRadius, mStoredRadius, angle, getTargetDirection());
Calling PcamCamera::getTargetDirection will be the first thing you see happening at this call site. font will be the second to last parameter you see being passed in before finally gfx is passed as the this parameter. This is helpful to know for complex virtual function calls where Ghidra just kind of has to guess a few things and may not even accurately show all of the parameters being passed to the virtual function due to other complex things going on with the stack at that time.
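If you want to see this ordering for yourself, a small logging sketch makes it visible. Everything here (trace, sum3, evaluationOrder) is hypothetical. Note that C++ leaves the evaluation order of function arguments unspecified, so the result depends on the compiler; the right-to-left behavior is what these notes observe for the MSVC build, not a language guarantee.

```cpp
#include <string>
#include <vector>

// Records the order in which the arguments below get evaluated.
static std::vector<std::string> gLog;

static int trace(const char* name, int value)
{
    gLog.push_back(name);
    return value;
}

static int sum3(int a, int b, int c) { return a + b + c; }

// Hypothetical demo: returns the names in the order they were evaluated.
std::vector<std::string> evaluationOrder()
{
    gLog.clear();
    (void)sum3(trace("first", 1), trace("second", 2), trace("third", 3));
    return gLog;
}
```

On a compiler that evaluates arguments right to left, the returned list starts with "third" and ends with "first", mirroring how getTargetDirection shows up before everything else at the texturePrintf call site.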
The MSVC ABI implements float to long conversion with _ftol, a helper function from MSVCRT(D).dll with a unique calling convention. If you encounter _ftol in Ghidra and it is decompiling in a nonsensical way, edit the function signature to accept a float parameter and return a long, enable "Use Custom Storage", and modify the float parameter's storage to be the register ST0:10. See also: https://github.com/NationalSecurityAgency/ghidra/issues/1246
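On the source side, a call to _ftol usually stands for nothing more than a plain float-to-integer cast. A minimal sketch, with a hypothetical function name:

```cpp
// Hypothetical example: on the MSVC build described in these notes, this cast
// is the kind of expression that shows up as a call to _ftol, with the float
// value passed in the x87 register ST0.
long truncateToLong(float value)
{
    // The conversion truncates toward zero.
    return static_cast<long>(value);
}
```

So when cleaning up pseudocode, a `_ftol(x)` call can generally be written back as a cast such as `(long)x` or `int(x)`, as in the `int(getFov())` argument of the texturePrintf call above.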