xyquest4018/xywrite4

✅ COMPLETED: Binary-Exact EXE Stub Reconstruction

Current Build Status (February 27, 2026 — COMPLETE)

| Metric | Value | Target | Status |
|---|---|---|---|
| Full EXE size | 681,824 | 681,824 | ✅ MATCH |
| Stub size | 369,568 | 369,568 | ✅ MATCH |
| Relocations | 597/597 | 597 | ✅ 100% MATCH |
| MZ header | 14/14 | 14/14 | ✅ 100% MATCH |
| Code+Data diffs | 0 / 366,496 | 0 | ✅ 100% MATCH |
| Overlay | 312,256/312,256 | 312,256 | ✅ 100% MATCH |
| String match | 6,805/6,805 | 6,805 | ✅ 100% MATCH |
| Reloc zone opcodes | 17,159/17,159 | 17,159 | ✅ 100% MATCH |
| MD5 | 09BEBE491B015EDE4CEA210469F863CC | Match | ✅ PERFECT MATCH |

STATUS: 100% BINARY-EXACT MATCH, ACHIEVED February 27, 2026

This is the first bit-perfect reconstruction of XyWrite IV v4.018 in history.

The MZ EXE stub (369,568 bytes) has been reconstructed so that the rebuilt EDITOR.EXE is byte-identical to the original. MD5 of the full file: 09BEBE491B015EDE4CEA210469F863CC

Target Binary

  • File: EDITOR.EXE (681,824 bytes = 369,568 stub + 312,256 overlay)
  • MD5: 09BEBE491B015EDE4CEA210469F863CC
  • Structure: MZ EXE stub (43 segments, 597 relocations) + overlay binary
  • Architecture: x86-16 real mode with .386 / .486 (MOVSX, SETcc, BT, BSWAP)
  • Language: 100% hand-written assembly — NO C runtime, NO compiler artifacts
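
The header and relocation figures above can be read straight out of the file. The following is a minimal Python sketch, not part of the repository: it assumes the "14/14" header metric refers to the 14 little-endian words of the classic 28-byte MZ header, and the function names are hypothetical.

```python
import struct

# The 14 little-endian 16-bit words of the classic MZ header (28 bytes).
MZ_FIELDS = (
    "e_magic", "e_cblp", "e_cp", "e_crlc", "e_cparhdr", "e_minalloc",
    "e_maxalloc", "e_ss", "e_sp", "e_csum", "e_ip", "e_cs", "e_lfarlc",
    "e_ovno",
)


def parse_mz_header(data: bytes) -> dict:
    """Return the 14 header words as a dict keyed by field name."""
    if len(data) < 28:
        raise ValueError("need at least 28 bytes for an MZ header")
    header = dict(zip(MZ_FIELDS, struct.unpack("<14H", data[:28])))
    if header["e_magic"] != 0x5A4D:  # the bytes "MZ" read as a little-endian word
        raise ValueError("missing MZ signature")
    return header


def header_size(header: dict) -> int:
    """Header size in bytes; e_cparhdr counts 16-byte paragraphs."""
    return header["e_cparhdr"] * 16


def read_relocations(data: bytes, header: dict) -> list:
    """Return the relocation table as (offset, segment) pairs."""
    start = header["e_lfarlc"]
    return [
        struct.unpack_from("<HH", data, start + 4 * i)
        for i in range(header["e_crlc"])
    ]
```

Run against ORIGINAL\EDITOR.EXE and the rebuilt file, comparing the two dicts and the two relocation lists would be one way to reproduce the header and relocation rows of the status table.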

Build System (build/ folder)

Folder structure:

build/
├── BUILDOVL.BAT       Assembles + links 35 overlay files
├── BUILDEXE.BAT       Assembles + links 64 core stub files
├── CORELINK.RSP       Linker response for core stub (64 OBJs)
├── OVLLINK.RSP        Linker response for overlay (35 OBJs)
├── CORE/              Core stub ASM source (64 files + SEGMENTS.INC + XYOPCDES.INC)
├── OVERLAY/           Overlay ASM source (35 files + OVLSEGS.INC)
│   └── OUTPUT/        Overlay build output (OBJ, LST, OVERLAY.EXE, OVERLAY.BIN)
├── BIN/               Core stub build output (OBJ, LST, EDITOR4.EXE, EDITOR.EXE)
├── ORIGINAL/          Reference binaries (EDITOR.EXE, OVERLAY.EXE, OVERLAY.BIN)
├── SCRIPTS/           Post-build PowerShell scripts
│   ├── MAKEEXE.PS1    Master: STRIPHDR → PADHDR → AUDIT
│   ├── STRIPHDR.PS1   Strip MZ header from OVERLAY.EXE → .BIN
│   ├── PADHDR.PS1     Pad stub header + combine + patch checksum
│   └── AUDIT.PS1      Full binary verification audit
└── TOOLCHAIN/         Local MASM 6.11 + LINK 5.13

Build Process (2 Phases)

Two-Phase Build Flow

Phase 1 — DOSBox-X:
  └→ BUILDOVL.BAT  (35 OVL ASM → OVERLAY\OUTPUT\OVERLAY.EXE)
  └→ BUILDEXE.BAT  (64 CORE ASM → BIN\EDITOR4.EXE)

Phase 2 — PowerShell: SCRIPTS\MAKEEXE.PS1
  └→ STRIPHDR.PS1  (strip MZ header → OVERLAY.BIN)
  └→ PADHDR.PS1    (pad header 0xA00→0xC00 + append overlay + patch)
  └→ AUDIT.PS1     (MD5 + byte verification → PERFECT BINARY MATCH)

#### Phase 1: Assembly + Link (DOSBox-X)

```batch
mount C F:\RE\REVERSE\XyWrite\XyWrite4\build
mount D F:\RE\REVERSE\XyWrite\Tools
C:
```

1. **BUILDOVL.BAT** — Assembles 35 overlay ASM files from `OVERLAY\`, outputs OBJ/LST to `OVERLAY\OUTPUT\`, links to `OVERLAY\OUTPUT\OVERLAY.EXE`
2. **BUILDEXE.BAT** — Assembles 64 core ASM files from `CORE\`, outputs OBJ/LST to `BIN\`, links to `BIN\EDITOR4.EXE`

#### Phase 2: Post-Processing (PowerShell)

```powershell
cd F:\RE\REVERSE\XyWrite\XyWrite4\build
.\SCRIPTS\MAKEEXE.PS1
```

MAKEEXE.PS1 runs three steps:

| Step | Script | Action |
|---|---|---|
| 1 | STRIPHDR.PS1 | Strip MZ header from OVERLAY\OUTPUT\OVERLAY.EXE → OVERLAY.BIN |
| 2 | PADHDR.PS1 | Pad stub header (0xA00→0xC00) + append overlay + patch checksum |
| 3 | AUDIT.PS1 | MD5 + byte comparison vs ORIGINAL\EDITOR.EXE |
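
Step 1 can be illustrated in a few lines. This is a hedged Python sketch of the idea only, assuming the header length is taken from the `e_cparhdr` word at offset 8; the actual STRIPHDR.PS1 is a PowerShell script and may determine the cut point differently. The function name is hypothetical.

```python
import struct


def strip_mz_header(exe: bytes) -> bytes:
    """Return the load image that follows the MZ header."""
    if exe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_cparhdr (offset 8) gives the header size in 16-byte paragraphs.
    (paragraphs,) = struct.unpack_from("<H", exe, 8)
    header_bytes = paragraphs * 16
    if header_bytes > len(exe):
        raise ValueError("header size exceeds file size")
    return exe[header_bytes:]
```

Applied to OVERLAY\OUTPUT\OVERLAY.EXE, the returned bytes would correspond to OVERLAY.BIN under the assumption above.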

Expected Output

*** PERFECT BINARY MATCH ***
MD5: 09BEBE491B015EDE4CEA210469F863CC
Size: 681,824 bytes
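
The checks behind that message amount to a size comparison, a byte-by-byte diff, and an MD5. The Python below is a minimal sketch of such an audit, not a port of AUDIT.PS1; the function name and the keys of the returned dict are hypothetical.

```python
import hashlib


def audit(built: bytes, reference: bytes) -> dict:
    """Compare a rebuilt binary against the reference copy."""
    # Count mismatching bytes in the overlapping part, then add any length gap.
    mismatches = sum(1 for a, b in zip(built, reference) if a != b)
    mismatches += abs(len(built) - len(reference))
    return {
        "size": len(built),
        "size_match": len(built) == len(reference),
        "differing_bytes": mismatches,
        "md5": hashlib.md5(built).hexdigest().upper(),
        "perfect": built == reference,
    }
```

For a successful build, `audit(built, reference)["perfect"]` would be true and the `md5` value would equal the hash shown above.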

Binary Architecture

Offset 0x00000–0x00BFF  MZ header (3,072 bytes, padded from 0x0A00)
Offset 0x00C00–0x5A39F  Stub load image (43 segments, 366,496 bytes)
Offset 0x5A3A0–0xA675F  Overlay (34 segments, 312,256 bytes)
Total: 681,824 bytes, MD5 09BEBE491B015EDE4CEA210469F863CC
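
The offsets and sizes in this map are mutually consistent, which can be confirmed with a short calculation. The Python sketch below uses only the offsets listed above; the helper names are hypothetical.

```python
# (name, first offset, last offset), taken from the layout above.
REGIONS = [
    ("MZ header", 0x00000, 0x00BFF),
    ("Stub load image", 0x00C00, 0x5A39F),
    ("Overlay", 0x5A3A0, 0xA675F),
]


def region_sizes(regions):
    """Size of each region in bytes (both end offsets are inclusive)."""
    return {name: last - first + 1 for name, first, last in regions}


def is_contiguous(regions):
    """True when every region starts right after the previous one ends."""
    return all(
        regions[i + 1][1] == regions[i][2] + 1
        for i in range(len(regions) - 1)
    )


sizes = region_sizes(REGIONS)
stub_total = sizes["MZ header"] + sizes["Stub load image"]  # header + load image
file_total = stub_total + sizes["Overlay"]
```

Here `stub_total` comes to 369,568 and `file_total` to 681,824, matching the stub and full EXE sizes in the status table.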

XyWrite4 – Refactor & Modernization TODO

Code Documentation

  • Comment all assembly routines (purpose, inputs, outputs, side effects)
  • Add file-level headers describing each module
  • Document calling conventions used across routines
  • Explain any non-obvious optimizations or low-level tricks

Code Organization

  • Split large assembly files into smaller, logical modules
  • Group related subroutines into dedicated files (e.g., text handling, I/O, UI)
  • Standardize file naming conventions
  • Ensure each file has a clear, single responsibility

Labels & Naming

  • Identify all ambiguous or auto-generated labels
  • Rename labels to meaningful, descriptive names
  • Establish naming conventions for:
    • Functions
    • Local labels
    • Global labels
    • Constants and macros
  • Remove unused or duplicate labels

Build System

  • Organize build scripts for modular assembly files
  • Ensure reproducible builds
  • Add debug vs release build configurations
  • Document build steps clearly

Code Cleanup

Reverse Engineering / Understanding

  • Map out high-level architecture of the original codebase
  • Identify core subsystems (editor, rendering, input, file I/O)
  • Document data structures and memory layout
  • Trace key execution paths (startup, editing loop, save/load)

Testing & Validation

  • Verify behavior matches original XyWrite functionality
  • Create small test cases for critical routines
  • Check edge cases (large files, unusual input)
  • Validate stability after refactoring

Modernization Preparation

  • Identify reusable algorithms and logic
  • Separate platform-dependent code (DOS-specific parts)
  • Mark areas for future C++/Qt reimplementation
  • Define clean interfaces for porting components

Project Management

  • Track progress per module
  • Maintain changelog of refactors
  • Set milestones for incremental cleanup

Bugs & Limitations

  • Identify and document all existing bugs
  • Reproduce bugs consistently with test cases
  • Categorize bugs (critical, major, minor)
  • Identify architectural or design limitations
  • Document constraints imposed by legacy DOS environment
  • Fix confirmed bugs systematically
  • Refactor or redesign areas causing major limitations

Feature Planning

  • Identify missing or desirable features for XyWrite4
  • Prioritize features (core vs optional)
  • Ensure new features align with lightweight philosophy
  • Avoid feature bloat—define strict inclusion criteria
  • Create a roadmap for feature implementation

Toolchain & Assembly Accuracy

  • Identify the original assembler used (if possible) for XyWrite sources

  • Evaluate modern compatible assemblers (MASM, TASM, NASM, FASM)

  • Select the assembler that produces the most accurate binary output

  • Identify and configure a compatible linker for the chosen assembler

  • Ensure the build toolchain replicates original binary behavior as closely as possible

  • Audit all db-encoded instruction sequences

    • Example: db 81h, 0FFh, 6, 0 → replace with cmp di, 6
    • Further examples:
      • db 32h, 0E4h → xor ah, ah
      • db 8Bh, 0C8h → mov cx, ax
      • db 36h, 29h, 0Eh, 3Ah, 37h → sub word ptr ss:[0x373a], cx
      • db 7Eh, 2 → jle <label>
      • db 0F3h, 0A4h → rep movsb
      • db 8Bh, 0F3h → mov si, bx
  • Replace raw opcode (db) sequences with proper assembly mnemonics wherever possible

  • Identify why raw opcodes were originally used:

    • Assembler limitations
    • Optimization tricks
    • Self-modifying code
    • Macro/workaround behavior
  • Validate that rewritten instructions produce identical machine code

  • Use disassembly tools to verify correctness of transformations

  • Preserve behavior in edge cases (flags, segment overrides, etc.)

  • Document any instructions that must remain as raw opcodes and explain why

  • Establish guidelines for when db usage is acceptable vs prohibited

  • Create automated or semi-automated process for opcode-to-mnemonic conversion (if feasible)

  • Ensure final codebase is readable, maintainable, and assembler-friendly
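
Part of the db audit above can be mechanized. The Python below is a minimal sketch, not a tool from this repository, and its function names are hypothetical. It decodes the two-byte register-to-register examples from the list and shows why `cmp di, 6` needs care: x86 has two valid encodings for it, the imm16 form `81 FF 06 00` used in the db line and the shorter sign-extended imm8 form `83 FF 06`. An assembler that prefers the short form would change the byte count, so which form MASM 6.11 emits for the mnemonic should be confirmed from a listing before replacing that db.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

# Opcodes covered by this sketch: "op reg, r/m" forms only.
REG_REG_OPCODES = {0x8B: ("mov", REG16), 0x32: ("xor", REG8)}


def decode_reg_reg(code: bytes) -> str:
    """Decode a two-byte register-to-register instruction (ModRM mod = 11)."""
    if len(code) != 2 or code[0] not in REG_REG_OPCODES:
        raise ValueError("sequence not covered by this sketch")
    mnemonic, names = REG_REG_OPCODES[code[0]]
    modrm = code[1]
    if modrm >> 6 != 0b11:
        raise ValueError("memory operands are not handled here")
    reg, rm = (modrm >> 3) & 7, modrm & 7
    return f"{mnemonic} {names[reg]}, {names[rm]}"


def cmp_r16_imm_encodings(reg: str, imm: int) -> dict:
    """Return the valid encodings of 'cmp reg16, imm' keyed by immediate width."""
    modrm = 0xC0 | (7 << 3) | REG16.index(reg)  # mod = 11, /7 selects CMP
    encodings = {
        "imm16": bytes([0x81, modrm]) + (imm & 0xFFFF).to_bytes(2, "little")
    }
    if -128 <= imm <= 127:  # fits the sign-extended 8-bit immediate form
        encodings["imm8"] = bytes([0x83, modrm, imm & 0xFF])
    return encodings
```

A conversion pass could accept a mnemonic replacement only when the assembler's output for it equals the original db bytes, and leave the db in place (with a comment explaining why) otherwise.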
